Bank Churn Prediction

Problem Statement

Context

Businesses like banks which provide service have to worry about problem of 'Customer Churn' i.e. customers leaving and joining another service provider. It is important to understand which aspects of the service influence a customer's decision in this regard. Management can concentrate efforts on improvement of service, keeping in mind these priorities.

Objective

You as a Data scientist with the bank need to build a neural network based classifier that can determine whether a customer will leave the bank or not in the next 6 months.

Data Dictionary

  • CustomerId: Unique ID which is assigned to each customer

  • Surname: Last name of the customer

  • CreditScore: It defines the credit history of the customer.

  • Geography: A customer’s location

  • Gender: It defines the Gender of the customer

  • Age: Age of the customer

  • Tenure: Number of years for which the customer has been with the bank

  • NumOfProducts: refers to the number of products that a customer has purchased through the bank.

  • Balance: Account balance

  • HasCrCard: It is a categorical variable which decides whether the customer has credit card or not.

  • EstimatedSalary: Estimated salary

  • isActiveMember: Is is a categorical variable which decides whether the customer is active member of the bank or not ( Active member in the sense, using bank products regularly, making transactions etc )

  • Exited : whether or not the customer left the bank within six month. It can take two values 0=No ( Customer did not leave the bank ) 1=Yes ( Customer left the bank )

In [ ]:
!pip install tensorflow

Importing necessary libraries

In [1]:
# Libraries to help with reading and manipulating data
import numpy as np
import pandas as pd
import matplotlib
import sklearn
# libaries to help with data visualization
import matplotlib.pyplot as pyplot
import matplotlib.pyplot as plt
import seaborn as sns
# Library to split data
from sklearn.model_selection import train_test_split
# library to import to standardize the data
from sklearn.preprocessing import StandardScaler

#Imports functions for evaluating the performance of machine learning models
from sklearn.metrics import confusion_matrix, f1_score,accuracy_score, recall_score, precision_score, classification_report
import keras
#Importing classback API
from keras import callbacks
# Importing tensorflow library
import tensorflow as tf
import time  # Module for time-related operations.
# importing different functions to build models
from tensorflow.keras.layers import Dense, Dropout, InputLayer
from tensorflow.keras.models import Sequential

# Importing backend
from tensorflow.keras import backend

# Importing optimizers
from keras.optimizers import Adam

# Library to avoid the warnings
import warnings
warnings.filterwarnings("ignore")

Loading the dataset

In [2]:
#mounting Google Drive
from google.colab import drive
drive.mount('/content/drive')
Mounted at /content/drive
In [3]:
#reading the dataset
df = pd.read_csv('/content/drive/MyDrive/Python/bank-1.csv')
In [4]:
#make a copy of data
data=df.copy()

Data Overview

In [5]:
#top 5 records of the data set
data.head()
Out[5]:
RowNumber CustomerId Surname CreditScore Geography Gender Age Tenure Balance NumOfProducts HasCrCard IsActiveMember EstimatedSalary Exited
0 1 15634602 Hargrave 619 France Female 42 2 0.00 1 1 1 101348.88 1
1 2 15647311 Hill 608 Spain Female 41 1 83807.86 1 0 1 112542.58 0
2 3 15619304 Onio 502 France Female 42 8 159660.80 3 1 0 113931.57 1
3 4 15701354 Boni 699 France Female 39 1 0.00 2 0 0 93826.63 0
4 5 15737888 Mitchell 850 Spain Female 43 2 125510.82 1 1 1 79084.10 0

Observations

  • In the above dataset, Exited is the target Variable with binay output values and this will be a Binary classification problem with two possible outcomes
In [6]:
#checking data info
data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 14 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   RowNumber        10000 non-null  int64  
 1   CustomerId       10000 non-null  int64  
 2   Surname          10000 non-null  object 
 3   CreditScore      10000 non-null  int64  
 4   Geography        10000 non-null  object 
 5   Gender           10000 non-null  object 
 6   Age              10000 non-null  int64  
 7   Tenure           10000 non-null  int64  
 8   Balance          10000 non-null  float64
 9   NumOfProducts    10000 non-null  int64  
 10  HasCrCard        10000 non-null  int64  
 11  IsActiveMember   10000 non-null  int64  
 12  EstimatedSalary  10000 non-null  float64
 13  Exited           10000 non-null  int64  
dtypes: float64(2), int64(9), object(3)
memory usage: 1.1+ MB

Observations

  • There are 10000 rows and 14 columns in the data
  • Out of 14 coulumns , 3 are Object type and 11 are numrical type
  • There is no missing value in the data
In [7]:
#checking statistical summary of data
data.describe()
Out[7]:
RowNumber CustomerId CreditScore Age Tenure Balance NumOfProducts HasCrCard IsActiveMember EstimatedSalary Exited
count 10000.00000 1.000000e+04 10000.000000 10000.000000 10000.000000 10000.000000 10000.000000 10000.00000 10000.000000 10000.000000 10000.000000
mean 5000.50000 1.569094e+07 650.528800 38.921800 5.012800 76485.889288 1.530200 0.70550 0.515100 100090.239881 0.203700
std 2886.89568 7.193619e+04 96.653299 10.487806 2.892174 62397.405202 0.581654 0.45584 0.499797 57510.492818 0.402769
min 1.00000 1.556570e+07 350.000000 18.000000 0.000000 0.000000 1.000000 0.00000 0.000000 11.580000 0.000000
25% 2500.75000 1.562853e+07 584.000000 32.000000 3.000000 0.000000 1.000000 0.00000 0.000000 51002.110000 0.000000
50% 5000.50000 1.569074e+07 652.000000 37.000000 5.000000 97198.540000 1.000000 1.00000 1.000000 100193.915000 0.000000
75% 7500.25000 1.575323e+07 718.000000 44.000000 7.000000 127644.240000 2.000000 1.00000 1.000000 149388.247500 0.000000
max 10000.00000 1.581569e+07 850.000000 92.000000 10.000000 250898.090000 4.000000 1.00000 1.000000 199992.480000 1.000000

Observations

  • Credit Score ranges from 350 to 850, with 50% cutomers below score of 652
  • Customers age ranges betweeen 18 to 92 years with 50% aging below 37 and 75% aging below 44
  • Tenure ranges from 0-10 years, 50% below 5 years and 75% below 7 years
  • Account Balance has a great variance from 0 to ~250K, average being ~76.5K , 50% customers having below ~97k and 75% below ~127K
  • Estimated Salaray ranges from ~11$ to ~200K $, 50% maing below ~100K and 75% belo ~150K
  • The Exited column data shows that 20.37 % of records are 1(Exited customers and 79.63 % customers are currently with the bank
In [8]:
#Check for data duplication
data.duplicated().sum()
Out[8]:
0

Observations

  • There is no duplicated data
In [9]:
# checking the number of unique values in each column
data.nunique()
Out[9]:
0
RowNumber 10000
CustomerId 10000
Surname 2932
CreditScore 460
Geography 3
Gender 2
Age 70
Tenure 11
Balance 6382
NumOfProducts 4
HasCrCard 2
IsActiveMember 2
EstimatedSalary 9999
Exited 2

Observations

  • The columns Rownumber and CustomerId's have unique values which can be dropped as they do not add any value in EDA and model building
  • Also the approximate 3000 values of Surnames will not add any value in EDA or Data Modelling even they they are not uniue, so we can drop it.
  • Geography and Gender can be treated as categorical variables which we will label encode during Data Modelling
In [10]:
#check unqiue values of  Variables
print(f"Unique values in 'Geography':",data['Geography'].unique())
print(f"Unique values in 'Gender':",data['Gender'].unique())
print(f"Unique values in 'Tenure':",data['Tenure'].unique())
print(f"Unique values in 'No of Products':",df['NumOfProducts'].unique())
Unique values in 'Geography': ['France' 'Spain' 'Germany']
Unique values in 'Gender': ['Female' 'Male']
Unique values in 'Tenure': [ 2  1  8  7  4  6  3 10  5  9  0]
Unique values in 'No of Products': [1 3 2 4]

Observations

  • From above we can infer the following variables as
    • Numerical(6) - CreditScore,Age,Tenure,Balance,EstimatedSalary,Exited,NoofProd
    • Categorical (4) - Gender,Geography,hasCard,IaActivemem,
    • Object(1) - Surname

Exploratory Data Analysis

Data Cleaning

In [11]:
#Dropping rownumber and customerid before EDA due to uniqueness in each row
# enrollee_id is unique for each candidate and might not add value to modeling
data.drop(["RowNumber","CustomerId","Surname"],axis=1, inplace=True)
#validate the dropped columns
data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 11 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   CreditScore      10000 non-null  int64  
 1   Geography        10000 non-null  object 
 2   Gender           10000 non-null  object 
 3   Age              10000 non-null  int64  
 4   Tenure           10000 non-null  int64  
 5   Balance          10000 non-null  float64
 6   NumOfProducts    10000 non-null  int64  
 7   HasCrCard        10000 non-null  int64  
 8   IsActiveMember   10000 non-null  int64  
 9   EstimatedSalary  10000 non-null  float64
 10  Exited           10000 non-null  int64  
dtypes: float64(2), int64(7), object(2)
memory usage: 859.5+ KB
In [12]:
data.head()
Out[12]:
CreditScore Geography Gender Age Tenure Balance NumOfProducts HasCrCard IsActiveMember EstimatedSalary Exited
0 619 France Female 42 2 0.00 1 1 1 101348.88 1
1 608 Spain Female 41 1 83807.86 1 0 1 112542.58 0
2 502 France Female 42 8 159660.80 3 1 0 113931.57 1
3 699 France Female 39 1 0.00 2 0 0 93826.63 0
4 850 Spain Female 43 2 125510.82 1 1 1 79084.10 0

Function Definitions for EDA.

In [ ]:
# function to plot a boxplot and a histogram along the same scale.


def histogram_boxplot(data, feature, figsize=(12, 7), kde=False, bins=None):
    """
    Boxplot and histogram combined

    data: dataframe
    feature: dataframe column
    figsize: size of figure (default (12,7))
    kde: whether to the show density curve (default False)
    bins: number of bins for histogram (default None)
    """
    f2, (ax_box2, ax_hist2) = plt.subplots(
        nrows=2,  # Number of rows of the subplot grid= 2
        sharex=True,  # x-axis will be shared among all subplots
        gridspec_kw={"height_ratios": (0.25, 0.75)},
        figsize=figsize,
    )  # creating the 2 subplots
    sns.boxplot(
        data=data, x=feature, ax=ax_box2, showmeans=True, color="violet"
    )  # boxplot will be created and a triangle will indicate the mean value of the column
    sns.histplot(
        data=data, x=feature, kde=kde, ax=ax_hist2, bins=bins, palette="winter"
    ) if bins else sns.histplot(
        data=data, x=feature, kde=kde, ax=ax_hist2
    )  # For histogram
    ax_hist2.axvline(
        data[feature].mean(), color="green", linestyle="--"
    )  # Add mean to the histogram
    ax_hist2.axvline(
        data[feature].median(), color="black", linestyle="-"
    )  # Add median to the histogram
In [ ]:
# function to create labeled barplots


def labeled_barplot(data, feature, perc=False, n=None):
    """
    Barplot with percentage at the top

    data: dataframe
    feature: dataframe column
    perc: whether to display percentages instead of count (default is False)
    n: displays the top n category levels (default is None, i.e., display all levels)
    """

    total = len(data[feature])  # length of the column
    count = data[feature].nunique()
    if n is None:
        plt.figure(figsize=(count + 1, 5))
    else:
        plt.figure(figsize=(n + 1, 5))

    plt.xticks(rotation=90, fontsize=15)
    ax = sns.countplot(
        data=data,
        x=feature,
        palette="Paired",
        order=data[feature].value_counts().index[:n].sort_values(),
    )

    for p in ax.patches:
        if perc == True:
            label = "{:.1f}%".format(
                100 * p.get_height() / total
            )  # percentage of each class of the category
        else:
            label = p.get_height()  # count of each level of the category

        x = p.get_x() + p.get_width() / 2  # width of the plot
        y = p.get_height()  # height of the plot

        ax.annotate(
            label,
            (x, y),
            ha="center",
            va="center",
            size=12,
            xytext=(0, 5),
            textcoords="offset points",
        )  # annotate the percentage

    plt.show()  # show the plot
In [ ]:
# function to plot stacked bar chart

def stacked_barplot(data, predictor, target):
    """
    Print the category counts and plot a stacked bar chart

    data: dataframe
    predictor: independent variable
    target: target variable
    """
    count = data[predictor].nunique()
    sorter = data[target].value_counts().index[-1]
    tab1 = pd.crosstab(data[predictor], data[target], margins=True).sort_values(
        by=sorter, ascending=False
    )
    print(tab1)
    print("-" * 120)
    tab = pd.crosstab(data[predictor], data[target], normalize="index").sort_values(
        by=sorter, ascending=False
    )
    tab.plot(kind="bar", stacked=True, figsize=(count + 1, 5))
    plt.legend(
        loc="lower left", frameon=False,
    )
    plt.legend(loc="upper left", bbox_to_anchor=(1, 1))
    plt.show()
In [ ]:
### Function to plot distributions

def distribution_plot_wrt_target(data, predictor, target):

    fig, axs = plt.subplots(2, 2, figsize=(12, 10))

    target_uniq = data[target].unique()

    axs[0, 0].set_title("Distribution of target for target=" + str(target_uniq[0]))
    sns.histplot(
        data=data[data[target] == target_uniq[0]],
        x=predictor,
        kde=True,
        ax=axs[0, 0],
        color="teal",
    )

    axs[0, 1].set_title("Distribution of target for target=" + str(target_uniq[1]))
    sns.histplot(
        data=data[data[target] == target_uniq[1]],
        x=predictor,
        kde=True,
        ax=axs[0, 1],
        color="orange",
    )

    axs[1, 0].set_title("Boxplot w.r.t target")
    sns.boxplot(data=data, x=target, y=predictor, ax=axs[1, 0], palette="gist_rainbow")

    axs[1, 1].set_title("Boxplot (without outliers) w.r.t target")
    sns.boxplot(
        data=data,
        x=target,
        y=predictor,
        ax=axs[1, 1],
        showfliers=False,
        palette="gist_rainbow",
    )

    plt.tight_layout()
    plt.show()

Univariate Analysis

Observation:CreditScore

In [ ]:
#Plot CreditScore
histogram_boxplot(data, "CreditScore")

Obeservations

  • CreditScoe is a slightly left skwerd distribution,median around 650 and some outliers below 400, and Q1-Q3 ranges 584 to 718

Observation:Age

In [ ]:
#Plot Age
histogram_boxplot(data, "Age")

Observations

  • Age is a right skewed distribution,customer density below age 50 is high and tapering after 60. There are some outliers above 60

Observation:Tenure

In [ ]:
#Plot Tenure
histogram_boxplot(data, "Tenure")

Observations

  • Customer across tenures are somewhat evently distributed between 1-9 years(~1000) ,however customers with 0 or 10 years tenure are almost half

Observation:Balance

In [ ]:
#Plot Balance
histogram_boxplot(data, "Balance")

Observations

  • Approximately 3500 customers have zero balance
  • Apart from zero balance, the rest of the distribution is normal

Observation: EstimatedSalary

In [ ]:
#Plot Estimated Salary
histogram_boxplot(data, "EstimatedSalary")

Observations

  • Esitmated Salary is evenly distributed

Observation on Gender

In [ ]:
labeled_barplot(data, "Gender",perc=True)

Observations

  • There are 54.6% male and 45.4% females in the data set

Observation on Geography

In [ ]:
labeled_barplot(data, "Geography",perc=True)

Observations

  • Half customers in Framce and rest 25% each are from Germany and Spain

Observation on HasCrard

In [ ]:
labeled_barplot(data, "HasCrCard",perc=True)

Observations

  • Aproximately 30% customers do not have card and 70% have cards

Observation on IsActiveMember

In [ ]:
labeled_barplot(data, "IsActiveMember",perc=True)

Observation on NumOfProducts

In [ ]:
labeled_barplot(data, "NumOfProducts",perc=True)

Observations

  • Approximately 51% of customers have 1 product and 46%
  • There are very few(2.7%) customers with 3 products and very few(0.6%)with 4 products

Bivariate Analysis

Pairplot

In [ ]:
sns.pairplot(data, hue ='Exited' , diag_kind='hist')
Out[ ]:
<seaborn.axisgrid.PairGrid at 0x7d02632e1e90>

Observations

  • Most customers who exited are between 40-62 age .Also customers above 40 who are not active members have higher churn
  • Customers below credit score 400 have a high density of leaving
  • All customers with 4 produts and most with 3 products have left the bank
  • Non active members with 1 product tend and active members with 2 products tend to stay

HeatMap

In [ ]:
plt.figure(figsize=(10,8))
sns.heatmap(data.corr(numeric_only = True), annot=True, vmin=-1, vmax=1, fmt=".2f", cmap="coolwarm",linewidths=0.5)
plt.show()

Observations There are only two noticable correlations :

  • No of Products and Balance are negatively co-related with a factor of 3 (-ie: when no of products increase, Balance decreases)
  • Target Variable is postively co-related with Age with a factor of 0.29

Gender wrt Target

In [ ]:
stacked_barplot(df,"Gender","Exited")
Exited     0     1    All
Gender                   
All     7963  2037  10000
Female  3404  1139   4543
Male    4559   898   5457
------------------------------------------------------------------------------------------------------------------------

observations

  • ~33% of female customers left and ~20% of male customers left

Geography wrt Target

In [ ]:
stacked_barplot(df,"Geography","Exited")
Exited        0     1    All
Geography                   
All        7963  2037  10000
Germany    1695   814   2509
France     4204   810   5014
Spain      2064   413   2477
------------------------------------------------------------------------------------------------------------------------

Observations

  • Germany has 32% customers leaving,France 16% and Spain too 16%, Germany has almost double percentage of customers leaving than other two countries

HasCrCard wrt Target

In [ ]:
stacked_barplot(df,"HasCrCard","Exited")
Exited        0     1    All
HasCrCard                   
All        7963  2037  10000
1          5631  1424   7055
0          2332   613   2945
------------------------------------------------------------------------------------------------------------------------

Observations

  • Having Card is not a single factor influencing customers leaving the bank as both customers having and not having Card have left the bank in equal ratio.

IsAtiveMember wrt Target

In [ ]:
stacked_barplot(df,"IsActiveMember","Exited")
Exited             0     1    All
IsActiveMember                   
All             7963  2037  10000
0               3547  1302   4849
1               4416   735   5151
------------------------------------------------------------------------------------------------------------------------

Observations

  • 27% of Active members left the bank wheras 14% of customers no active members have left the bank

No of Products Wrt Target

In [ ]:
stacked_barplot(df,"NumOfProducts","Exited")
Exited            0     1    All
NumOfProducts                   
All            7963  2037  10000
1              3675  1409   5084
2              4242   348   4590
3                46   220    266
4                 0    60     60
------------------------------------------------------------------------------------------------------------------------

Observations

  • cusometrs with 3,4 products have a less and they have a high rate of leaving the bank
  • Most customers with 2 produts stay.

Age Distribution wrt Target

In [ ]:
distribution_plot_wrt_target(data, "Age", "Exited")

Observations

  • Most of customers who did not leave the bank were between ~32 to ~42 age and those who left were bettween ~37 to ~52 Age

Credit Score wrt Target

In [ ]:
distribution_plot_wrt_target(data, "CreditScore", "Exited")

Observations

  • Credit score does not have a direct realtion with cusomers leaving, however there are customers with very low credit scores who have left the bank

Balance wrt Target

In [ ]:
distribution_plot_wrt_target(data, "Balance", "Exited")

Observation

  • Customers Leaving are high b/w 50000 to 170000 Balance

EstimatedSalary wrt Target

In [ ]:
distribution_plot_wrt_target(data, "EstimatedSalary", "Exited")

Observations

  • No specific outcomes on existing customers across Estimated Salary Distribution

Tenure wrt Target

In [ ]:
distribution_plot_wrt_target(data, "Tenure", "Exited")

Observation

  • Most Cusomerts who have left the bank have tenures from 2-8 years

No Of Products wrt Target

In [ ]:
distribution_plot_wrt_target(data, "NumOfProducts", "Exited")

EDA OBSERVATIONS-CONSOLIDATED

After doing univariate and Bivariate analysis, the significant points are

a. Customers in the age range of 40-62 tend to leave the bank, especially those who are not active members.

b. 50% customers are in France, however customers churn in Germany is more (approximately 33% of its customer base)

c. Other significant factors which influnce customeres leaving the bank are :

  • higher no of products( 3,4) : leave are their % is in total is 3%
  • Female members tend to leave more than male (33% vs 20%)
  • Customers with low credit scores(below 400)
  • Non Active member with more than 1 product tends to leave

Data Preprocessing

Steps Relevant in Data Proprocessing befoe model building

-Dropping of Columns which whi not add value (Customerid,RowId,Surname)-Accomplished before EDA

  • Encoding of Categorical Variables to numerical Values : Georgraphy and Gender
  • Splitting Data in Train and Test sets
  • Normalizing and Sacling Numerical Variables CreditScore, Age, Tenure, Balance, EstimatedSalary
In [13]:
#Step1: Keep a copy of Data before pre-processing
data1=data.copy()

Train-validation-test Split

In [14]:
#Splitting the target Variables and Predictors
X = data1.drop(["Exited"], axis=1)
Y = data1["Exited"]
In [15]:
# splitting the data in 80:20 ratio for train and temporary data
X_train, X_temp, Y_train, Y_temp = train_test_split(X, Y, test_size=0.2,random_state=1)
In [16]:
# splitting the temporary data in 50:50 ratio for validation and test data
X_val,X_test,Y_val,Y_test = train_test_split(X_temp,Y_temp,test_size=0.5,random_state=1)
In [17]:
print("Number of rows in train data =", X_train.shape[0])
print("Number of rows in validation data =", X_val.shape[0])
print("Number of rows in test data =", X_test.shape[0])
Number of rows in train data = 8000
Number of rows in validation data = 1000
Number of rows in test data = 1000
In [18]:
print("Number of col in train data =", X_train.shape[1])
print("Number of col in validation data =", X_val.shape[1])
print("Number of col in test data =", X_test.shape[1])
Number of col in train data = 10
Number of col in validation data = 10
Number of col in test data = 10

Dummy Variable Creation

In [19]:
# List of columns to be converted into dummy variables
categorical_columns = ['Gender', 'Geography', 'HasCrCard', 'IsActiveMember']
In [20]:
X_train = pd.get_dummies(X_train,columns=categorical_columns, drop_first=True)
X_train = X_train.astype(float)

X_val = pd.get_dummies(X_val,columns=categorical_columns, drop_first=True)
X_val = X_val.astype(float)

X_test = pd.get_dummies(X_test,columns=categorical_columns, drop_first=True)
X_test = X_test.astype(float)

print(X_train.shape, X_val.shape, X_test.shape)
(8000, 11) (1000, 11) (1000, 11)
In [21]:
# Calculate and print target distribution in each set
def print_distribution(y, set_name):
    print(f"{set_name} Distribution:")
    print(Y.value_counts(normalize=True) * 100)
    print()

print_distribution(Y_train, "Training Set")
print_distribution(Y_val, "Validation Set")
print_distribution(Y_test, "Test Set")
Training Set Distribution:
Exited
0    79.63
1    20.37
Name: proportion, dtype: float64

Validation Set Distribution:
Exited
0    79.63
1    20.37
Name: proportion, dtype: float64

Test Set Distribution:
Exited
0    79.63
1    20.37
Name: proportion, dtype: float64

Data Normalization

In [22]:
scaler = StandardScaler()

# Here, we are passing all the features (numerical and categorical), that's okay as min-max scaler will not change values of categorical variables
X_train_normalized = scaler.fit_transform(X_train)
In [23]:
X_val_normalized = scaler.transform(X_val)
In [24]:
X_test_normalized = scaler.transform(X_test)
In [25]:
X_val_normalized.shape
Out[25]:
(1000, 11)

Model Building

Model Evaluation Criterion

Write down the logic for choosing the metric that would be the best metric for this business scenario.

  • Accuracy followed by F1 Score will be the most significant criterions for evaluation based on the logic outlined below:

    -- Accurancy Since the data set is imbalanced(the no of target customers who are leaving are significantly lesser than the customers who are existing),high accuracy ie correct predictions will be the most important metric

    -- Prescision, Recall and F1 Score

    • Maximise prescision : by ensuring the predicted customers leaving the bank are actually leaving
    • Maximize Recall : Ensuring customers predicting actual customers leaving and minimizing False Negatives (scenario where cutomers are wrongly predicted to curn where they do not leave the bank)
    • F1 Score: This will give a single metric which will capture both Prescion and Recall Criteria mentioned above especially important in a imbalanced class distribution

Utility Functions

In [26]:
def plot(history, name):
    """
    Function to plot loss/accuracy

    history: an object which stores the metrics and losses.
    name: can be one of Loss or Accuracy
    """
    fig, ax = plt.subplots() #Creating a subplot with figure and axes.
    plt.plot(history.history[name]) #Plotting the train accuracy or train loss
    plt.plot(history.history['val_'+name]) #Plotting the validation accuracy or validation loss

    plt.title('Model ' + name.capitalize()) #Defining the title of the plot.
    plt.ylabel(name.capitalize()) #Capitalizing the first letter.
    plt.xlabel('Epoch') #Defining the label for the x-axis.
    fig.legend(['Train', 'Validation'], loc="outside right upper") #Defining the legend, loc controls the position of the legend.
In [27]:
# defining a function to compute different metrics to check performance of a classification model built using statsmodels
def model_performance_classification(
    model, predictors, target, threshold=0.5
):
    """
    Function to compute different metrics to check classification model performance

    model: classifier
    predictors: independent variables
    target: dependent variable
    threshold: threshold for classifying the observation as class 1
    """

    # checking which probabilities are greater than threshold
    pred = model.predict(predictors) > threshold
    # pred_temp = model.predict(predictors) > threshold
    # # rounding off the above values to get classes
    # pred = np.round(pred_temp)

    acc = accuracy_score(target, pred)  # to compute Accuracy
    recall = recall_score(target, pred, average='weighted')  # to compute Recall
    precision = precision_score(target, pred, average='weighted')  # to compute Precision
    f1 = f1_score(target, pred, average='weighted')  # to compute F1-score

    # creating a dataframe of metrics
    df_perf = pd.DataFrame(
        {"Accuracy": acc, "Recall": recall, "Precision": precision, "F1 Score": f1,},
        index=[0],
    )

    return df_perf
In [28]:
# defining the batch size and # epochs upfront as we'll be using the same values for all models
epochs = 50
batch_size = 64

Neural Network with SGD Optimizer

Since this is a binary classification problem , we will build a simple Feed Forward Neural Network

  • Let's start with a neural network consisting of
    • two hidden layers with 14 and 7 neurons respectively
    • activation function of ReLU.
    • SGD as the optimizer
In [29]:
# clears the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
In [30]:
#Initializing the neural network
model = Sequential()
model.add(Dense(14,activation="relu",input_dim=X_train_normalized.shape[1]))
#model.add(Dense(7,activation="relu"))
model.add(Dense(7,activation="relu"))
model.add(Dense(1,activation="sigmoid"))
In [31]:
model.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                          Output Shape                         Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ dense (Dense)                        │ (None, 14)                  │             168 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_1 (Dense)                      │ (None, 7)                   │             105 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_2 (Dense)                      │ (None, 1)                   │               8 │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
 Total params: 281 (1.10 KB)
 Trainable params: 281 (1.10 KB)
 Non-trainable params: 0 (0.00 B)
In [32]:
optimizer = tf.keras.optimizers.SGD()    # defining SGD as the optimizer to be used
model.compile(loss='binary_crossentropy', optimizer=optimizer,metrics=['accuracy','f1_score'])
In [33]:
start = time.time()
history = model.fit(X_train_normalized, Y_train, validation_data=(X_val_normalized,Y_val) , batch_size=batch_size, epochs=epochs)
end=time.time()
Epoch 1/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.4979 - f1_score: 0.3377 - loss: 0.7507 - val_accuracy: 0.8190 - val_f1_score: 0.3036 - val_loss: 0.5296
Epoch 2/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8006 - f1_score: 0.3362 - loss: 0.5240 - val_accuracy: 0.8270 - val_f1_score: 0.3036 - val_loss: 0.4594
Epoch 3/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8021 - f1_score: 0.3375 - loss: 0.4795 - val_accuracy: 0.8290 - val_f1_score: 0.3036 - val_loss: 0.4375
Epoch 4/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.7997 - f1_score: 0.3430 - loss: 0.4710 - val_accuracy: 0.8280 - val_f1_score: 0.3036 - val_loss: 0.4257
Epoch 5/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8048 - f1_score: 0.3367 - loss: 0.4539 - val_accuracy: 0.8320 - val_f1_score: 0.3036 - val_loss: 0.4177
Epoch 6/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8086 - f1_score: 0.3345 - loss: 0.4504 - val_accuracy: 0.8350 - val_f1_score: 0.3036 - val_loss: 0.4110
Epoch 7/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8064 - f1_score: 0.3329 - loss: 0.4440 - val_accuracy: 0.8360 - val_f1_score: 0.3036 - val_loss: 0.4052
Epoch 8/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8074 - f1_score: 0.3404 - loss: 0.4414 - val_accuracy: 0.8370 - val_f1_score: 0.3036 - val_loss: 0.4001
Epoch 9/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8093 - f1_score: 0.3430 - loss: 0.4383 - val_accuracy: 0.8420 - val_f1_score: 0.3036 - val_loss: 0.3956
Epoch 10/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8018 - f1_score: 0.3474 - loss: 0.4438 - val_accuracy: 0.8410 - val_f1_score: 0.3036 - val_loss: 0.3914
Epoch 11/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8066 - f1_score: 0.3473 - loss: 0.4408 - val_accuracy: 0.8410 - val_f1_score: 0.3036 - val_loss: 0.3878
Epoch 12/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.8131 - f1_score: 0.3374 - loss: 0.4289 - val_accuracy: 0.8390 - val_f1_score: 0.3036 - val_loss: 0.3846
Epoch 13/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.8211 - f1_score: 0.3349 - loss: 0.4178 - val_accuracy: 0.8430 - val_f1_score: 0.3036 - val_loss: 0.3818
Epoch 14/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.8204 - f1_score: 0.3304 - loss: 0.4202 - val_accuracy: 0.8460 - val_f1_score: 0.3036 - val_loss: 0.3795
Epoch 15/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8224 - f1_score: 0.3347 - loss: 0.4107 - val_accuracy: 0.8450 - val_f1_score: 0.3036 - val_loss: 0.3773
Epoch 16/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8188 - f1_score: 0.3317 - loss: 0.4127 - val_accuracy: 0.8440 - val_f1_score: 0.3036 - val_loss: 0.3751
Epoch 17/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8196 - f1_score: 0.3390 - loss: 0.4182 - val_accuracy: 0.8440 - val_f1_score: 0.3036 - val_loss: 0.3730
Epoch 18/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8209 - f1_score: 0.3361 - loss: 0.4163 - val_accuracy: 0.8450 - val_f1_score: 0.3036 - val_loss: 0.3712
Epoch 19/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8238 - f1_score: 0.3336 - loss: 0.4041 - val_accuracy: 0.8460 - val_f1_score: 0.3036 - val_loss: 0.3697
Epoch 20/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8255 - f1_score: 0.3315 - loss: 0.4093 - val_accuracy: 0.8470 - val_f1_score: 0.3036 - val_loss: 0.3678
Epoch 21/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8253 - f1_score: 0.3317 - loss: 0.3973 - val_accuracy: 0.8480 - val_f1_score: 0.3036 - val_loss: 0.3665
Epoch 22/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8305 - f1_score: 0.3257 - loss: 0.3912 - val_accuracy: 0.8480 - val_f1_score: 0.3036 - val_loss: 0.3650
Epoch 23/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8185 - f1_score: 0.3461 - loss: 0.4092 - val_accuracy: 0.8490 - val_f1_score: 0.3036 - val_loss: 0.3632
Epoch 24/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8261 - f1_score: 0.3311 - loss: 0.3961 - val_accuracy: 0.8510 - val_f1_score: 0.3036 - val_loss: 0.3621
Epoch 25/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8284 - f1_score: 0.3364 - loss: 0.3944 - val_accuracy: 0.8500 - val_f1_score: 0.3036 - val_loss: 0.3607
Epoch 26/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8213 - f1_score: 0.3411 - loss: 0.4060 - val_accuracy: 0.8490 - val_f1_score: 0.3036 - val_loss: 0.3594
Epoch 27/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8219 - f1_score: 0.3375 - loss: 0.4008 - val_accuracy: 0.8490 - val_f1_score: 0.3036 - val_loss: 0.3578
Epoch 28/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8254 - f1_score: 0.3414 - loss: 0.3997 - val_accuracy: 0.8490 - val_f1_score: 0.3036 - val_loss: 0.3563
Epoch 29/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8289 - f1_score: 0.3334 - loss: 0.3919 - val_accuracy: 0.8500 - val_f1_score: 0.3036 - val_loss: 0.3551
Epoch 30/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8292 - f1_score: 0.3295 - loss: 0.3911 - val_accuracy: 0.8540 - val_f1_score: 0.3036 - val_loss: 0.3540
Epoch 31/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8366 - f1_score: 0.3340 - loss: 0.3839 - val_accuracy: 0.8540 - val_f1_score: 0.3036 - val_loss: 0.3528
Epoch 32/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8366 - f1_score: 0.3349 - loss: 0.3897 - val_accuracy: 0.8580 - val_f1_score: 0.3036 - val_loss: 0.3511
Epoch 33/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8390 - f1_score: 0.3368 - loss: 0.3855 - val_accuracy: 0.8580 - val_f1_score: 0.3036 - val_loss: 0.3496
Epoch 34/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8434 - f1_score: 0.3322 - loss: 0.3795 - val_accuracy: 0.8610 - val_f1_score: 0.3036 - val_loss: 0.3478
Epoch 35/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.8443 - f1_score: 0.3341 - loss: 0.3774 - val_accuracy: 0.8650 - val_f1_score: 0.3036 - val_loss: 0.3460
Epoch 36/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.8434 - f1_score: 0.3304 - loss: 0.3751 - val_accuracy: 0.8680 - val_f1_score: 0.3036 - val_loss: 0.3448
Epoch 37/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8437 - f1_score: 0.3430 - loss: 0.3741 - val_accuracy: 0.8680 - val_f1_score: 0.3036 - val_loss: 0.3424
Epoch 38/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8422 - f1_score: 0.3388 - loss: 0.3797 - val_accuracy: 0.8690 - val_f1_score: 0.3036 - val_loss: 0.3406
Epoch 39/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8421 - f1_score: 0.3449 - loss: 0.3816 - val_accuracy: 0.8720 - val_f1_score: 0.3036 - val_loss: 0.3387
Epoch 40/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8516 - f1_score: 0.3378 - loss: 0.3731 - val_accuracy: 0.8710 - val_f1_score: 0.3036 - val_loss: 0.3372
Epoch 41/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8435 - f1_score: 0.3389 - loss: 0.3783 - val_accuracy: 0.8720 - val_f1_score: 0.3036 - val_loss: 0.3355
Epoch 42/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8491 - f1_score: 0.3390 - loss: 0.3739 - val_accuracy: 0.8730 - val_f1_score: 0.3036 - val_loss: 0.3338
Epoch 43/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8505 - f1_score: 0.3323 - loss: 0.3651 - val_accuracy: 0.8740 - val_f1_score: 0.3036 - val_loss: 0.3326
Epoch 44/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8498 - f1_score: 0.3386 - loss: 0.3742 - val_accuracy: 0.8740 - val_f1_score: 0.3036 - val_loss: 0.3309
Epoch 45/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8490 - f1_score: 0.3389 - loss: 0.3724 - val_accuracy: 0.8750 - val_f1_score: 0.3036 - val_loss: 0.3297
Epoch 46/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8510 - f1_score: 0.3321 - loss: 0.3640 - val_accuracy: 0.8770 - val_f1_score: 0.3036 - val_loss: 0.3279
Epoch 47/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8510 - f1_score: 0.3308 - loss: 0.3604 - val_accuracy: 0.8770 - val_f1_score: 0.3036 - val_loss: 0.3265
Epoch 48/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8522 - f1_score: 0.3381 - loss: 0.3602 - val_accuracy: 0.8770 - val_f1_score: 0.3036 - val_loss: 0.3254
Epoch 49/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8539 - f1_score: 0.3281 - loss: 0.3587 - val_accuracy: 0.8790 - val_f1_score: 0.3036 - val_loss: 0.3246
Epoch 50/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8501 - f1_score: 0.3380 - loss: 0.3667 - val_accuracy: 0.8800 - val_f1_score: 0.3036 - val_loss: 0.3232
In [34]:
print("Time taken in seconds ",end-start)
Time taken in seconds  32.0451545715332
In [35]:
plot(history,'loss')
In [36]:
plot(history,'accuracy')
In [37]:
model_0_train_perf = model_performance_classification(model, X_train_normalized, Y_train)
model_0_train_perf
250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
Out[37]:
Accuracy Recall Precision F1 Score
0 0.8545 0.8545 0.844325 0.83757
In [38]:
model_0_val_perf = model_performance_classification(model, X_val_normalized, Y_val)
model_0_val_perf
32/32 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step
Out[38]:
Accuracy Recall Precision F1 Score
0 0.88 0.88 0.872717 0.865521

Comments on Model Performance

  • Training and Validation Accurancy increased over epochs indication that model generalised and learn well to validation set
  • Training and Validation loss decreased steadity showing that model converged well
  • Validation F1-score remained constant at .3036 across epochs showing model is failing to identify true positive and reducing false negatives
  • The performance metrics (accuracy, recall, precision, F1 score) showing that the model is generalising well and not overfitting ,It is showing strong performance.

Observations

Neural Network with Adam Optimizer-M1

In [39]:
# clears the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
In [40]:
#Initializing the neural network
model_1 = Sequential()

# Input layer and the first hidden layer
model_1.add(Dense(14,activation="relu",input_dim = X_train_normalized.shape[1]))

# Second hidden layer
model_1.add(Dense(14, activation="relu"))

# third hidden layer
model_1.add(Dense(7, activation="relu"))

# output layer
model_1.add(Dense(1, activation="sigmoid"))
In [41]:
# Compile the model with Adam optimizer and a specified learning rate
#adam = Adam(learning_rate=0.001)
optimizer=keras.optimizers.Adam(learning_rate=0.001)

model_1.compile(optimizer=optimizer, loss='binary_crossentropy', metrics=['accuracy','f1_score'])
In [42]:
start = time.time()
history = model_1.fit(X_train_normalized, Y_train, validation_data=(X_val_normalized,Y_val) , batch_size=batch_size, epochs=epochs)
end=time.time()
Epoch 1/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 3s 8ms/step - accuracy: 0.6963 - f1_score: 0.3367 - loss: 0.6126 - val_accuracy: 0.8210 - val_f1_score: 0.3036 - val_loss: 0.4519
Epoch 2/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.8030 - f1_score: 0.3291 - loss: 0.4674 - val_accuracy: 0.8210 - val_f1_score: 0.3036 - val_loss: 0.4180
Epoch 3/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.7989 - f1_score: 0.3360 - loss: 0.4474 - val_accuracy: 0.8300 - val_f1_score: 0.3036 - val_loss: 0.4003
Epoch 4/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8099 - f1_score: 0.3368 - loss: 0.4294 - val_accuracy: 0.8320 - val_f1_score: 0.3036 - val_loss: 0.3861
Epoch 5/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8118 - f1_score: 0.3442 - loss: 0.4264 - val_accuracy: 0.8460 - val_f1_score: 0.3036 - val_loss: 0.3710
Epoch 6/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8197 - f1_score: 0.3445 - loss: 0.4147 - val_accuracy: 0.8510 - val_f1_score: 0.3036 - val_loss: 0.3628
Epoch 7/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8317 - f1_score: 0.3443 - loss: 0.3999 - val_accuracy: 0.8590 - val_f1_score: 0.3036 - val_loss: 0.3532
Epoch 8/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8409 - f1_score: 0.3394 - loss: 0.3851 - val_accuracy: 0.8750 - val_f1_score: 0.3036 - val_loss: 0.3412
Epoch 9/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8454 - f1_score: 0.3392 - loss: 0.3737 - val_accuracy: 0.8730 - val_f1_score: 0.3036 - val_loss: 0.3289
Epoch 10/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8479 - f1_score: 0.3390 - loss: 0.3685 - val_accuracy: 0.8770 - val_f1_score: 0.3036 - val_loss: 0.3209
Epoch 11/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8576 - f1_score: 0.3384 - loss: 0.3538 - val_accuracy: 0.8770 - val_f1_score: 0.3036 - val_loss: 0.3140
Epoch 12/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8562 - f1_score: 0.3419 - loss: 0.3457 - val_accuracy: 0.8770 - val_f1_score: 0.3036 - val_loss: 0.3107
Epoch 13/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8620 - f1_score: 0.3273 - loss: 0.3362 - val_accuracy: 0.8760 - val_f1_score: 0.3036 - val_loss: 0.3067
Epoch 14/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8568 - f1_score: 0.3340 - loss: 0.3526 - val_accuracy: 0.8790 - val_f1_score: 0.3036 - val_loss: 0.3064
Epoch 15/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8621 - f1_score: 0.3311 - loss: 0.3330 - val_accuracy: 0.8790 - val_f1_score: 0.3036 - val_loss: 0.3072
Epoch 16/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8626 - f1_score: 0.3446 - loss: 0.3432 - val_accuracy: 0.8790 - val_f1_score: 0.3036 - val_loss: 0.3086
Epoch 17/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8592 - f1_score: 0.3342 - loss: 0.3357 - val_accuracy: 0.8830 - val_f1_score: 0.3036 - val_loss: 0.3070
Epoch 18/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8591 - f1_score: 0.3354 - loss: 0.3347 - val_accuracy: 0.8810 - val_f1_score: 0.3036 - val_loss: 0.3065
Epoch 19/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8644 - f1_score: 0.3369 - loss: 0.3266 - val_accuracy: 0.8820 - val_f1_score: 0.3036 - val_loss: 0.3091
Epoch 20/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8623 - f1_score: 0.3367 - loss: 0.3334 - val_accuracy: 0.8840 - val_f1_score: 0.3036 - val_loss: 0.3035
Epoch 21/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 6ms/step - accuracy: 0.8647 - f1_score: 0.3361 - loss: 0.3320 - val_accuracy: 0.8830 - val_f1_score: 0.3036 - val_loss: 0.3036
Epoch 22/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 6ms/step - accuracy: 0.8684 - f1_score: 0.3253 - loss: 0.3221 - val_accuracy: 0.8830 - val_f1_score: 0.3036 - val_loss: 0.3075
Epoch 23/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 6ms/step - accuracy: 0.8598 - f1_score: 0.3350 - loss: 0.3428 - val_accuracy: 0.8800 - val_f1_score: 0.3036 - val_loss: 0.3053
Epoch 24/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8612 - f1_score: 0.3405 - loss: 0.3379 - val_accuracy: 0.8760 - val_f1_score: 0.3036 - val_loss: 0.3047
Epoch 25/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8651 - f1_score: 0.3306 - loss: 0.3245 - val_accuracy: 0.8810 - val_f1_score: 0.3036 - val_loss: 0.3066
Epoch 26/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8648 - f1_score: 0.3339 - loss: 0.3376 - val_accuracy: 0.8880 - val_f1_score: 0.3036 - val_loss: 0.3054
Epoch 27/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8635 - f1_score: 0.3374 - loss: 0.3339 - val_accuracy: 0.8800 - val_f1_score: 0.3036 - val_loss: 0.3059
Epoch 28/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8608 - f1_score: 0.3390 - loss: 0.3380 - val_accuracy: 0.8840 - val_f1_score: 0.3036 - val_loss: 0.3039
Epoch 29/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8636 - f1_score: 0.3329 - loss: 0.3328 - val_accuracy: 0.8820 - val_f1_score: 0.3036 - val_loss: 0.3064
Epoch 30/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8597 - f1_score: 0.3376 - loss: 0.3343 - val_accuracy: 0.8840 - val_f1_score: 0.3036 - val_loss: 0.3042
Epoch 31/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8625 - f1_score: 0.3408 - loss: 0.3343 - val_accuracy: 0.8880 - val_f1_score: 0.3036 - val_loss: 0.3022
Epoch 32/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8632 - f1_score: 0.3423 - loss: 0.3279 - val_accuracy: 0.8880 - val_f1_score: 0.3036 - val_loss: 0.3039
Epoch 33/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8682 - f1_score: 0.3343 - loss: 0.3265 - val_accuracy: 0.8850 - val_f1_score: 0.3036 - val_loss: 0.3061
Epoch 34/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8599 - f1_score: 0.3346 - loss: 0.3338 - val_accuracy: 0.8800 - val_f1_score: 0.3036 - val_loss: 0.3072
Epoch 35/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8666 - f1_score: 0.3373 - loss: 0.3242 - val_accuracy: 0.8900 - val_f1_score: 0.3036 - val_loss: 0.3030
Epoch 36/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8632 - f1_score: 0.3349 - loss: 0.3293 - val_accuracy: 0.8850 - val_f1_score: 0.3036 - val_loss: 0.3052
Epoch 37/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8636 - f1_score: 0.3369 - loss: 0.3259 - val_accuracy: 0.8900 - val_f1_score: 0.3036 - val_loss: 0.3018
Epoch 38/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8651 - f1_score: 0.3375 - loss: 0.3236 - val_accuracy: 0.8850 - val_f1_score: 0.3036 - val_loss: 0.3034
Epoch 39/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8614 - f1_score: 0.3404 - loss: 0.3295 - val_accuracy: 0.8860 - val_f1_score: 0.3036 - val_loss: 0.3040
Epoch 40/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8695 - f1_score: 0.3342 - loss: 0.3198 - val_accuracy: 0.8860 - val_f1_score: 0.3036 - val_loss: 0.3043
Epoch 41/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8580 - f1_score: 0.3531 - loss: 0.3356 - val_accuracy: 0.8830 - val_f1_score: 0.3036 - val_loss: 0.3064
Epoch 42/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.8626 - f1_score: 0.3404 - loss: 0.3313 - val_accuracy: 0.8880 - val_f1_score: 0.3036 - val_loss: 0.3041
Epoch 43/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.8624 - f1_score: 0.3329 - loss: 0.3323 - val_accuracy: 0.8870 - val_f1_score: 0.3036 - val_loss: 0.3041
Epoch 44/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 6ms/step - accuracy: 0.8606 - f1_score: 0.3496 - loss: 0.3310 - val_accuracy: 0.8830 - val_f1_score: 0.3036 - val_loss: 0.3051
Epoch 45/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8666 - f1_score: 0.3416 - loss: 0.3210 - val_accuracy: 0.8880 - val_f1_score: 0.3036 - val_loss: 0.3072
Epoch 46/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8633 - f1_score: 0.3391 - loss: 0.3301 - val_accuracy: 0.8870 - val_f1_score: 0.3036 - val_loss: 0.3053
Epoch 47/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8690 - f1_score: 0.3282 - loss: 0.3228 - val_accuracy: 0.8850 - val_f1_score: 0.3036 - val_loss: 0.3051
Epoch 48/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8628 - f1_score: 0.3408 - loss: 0.3314 - val_accuracy: 0.8870 - val_f1_score: 0.3036 - val_loss: 0.3035
Epoch 49/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8637 - f1_score: 0.3416 - loss: 0.3351 - val_accuracy: 0.8900 - val_f1_score: 0.3036 - val_loss: 0.3038
Epoch 50/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8670 - f1_score: 0.3308 - loss: 0.3251 - val_accuracy: 0.8860 - val_f1_score: 0.3036 - val_loss: 0.3067
In [43]:
print("Time taken in seconds ",end-start)
Time taken in seconds  35.62427258491516
In [44]:
plot(history,'loss')
In [45]:
plot(history,'accuracy')
In [46]:
model_1_train_perf = model_performance_classification(model_1, X_train_normalized, Y_train)
model_1_train_perf
250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
Out[46]:
Accuracy Recall Precision F1 Score
0 0.865 0.865 0.856189 0.853499
In [47]:
model_1_val_perf = model_performance_classification(model_1, X_val_normalized, Y_val)
model_1_val_perf
32/32 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step
Out[47]:
Accuracy Recall Precision F1 Score
0 0.886 0.886 0.878312 0.876533

Comments on Model Performance

  • The validation loss is consistently lesser than the training loss & Validation accuracy consistently hight than the Training accuracy which indicate good generalisation but could also be data leakage.
  • Training and Validation loss decreased steadity showing that model converged well
  • Validation F1-score remained constant at .3036 across epochs showing model is failing to identify true positive and reducing false negatives
  • The Model shows high accurancy on train & val sets,Good Precision and recall and cosistent f1 scores

Neural Network with Adam Optimizer and Dropout-M2

In [48]:
# clears the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
In [49]:
#Initializing the neural network
model_2 = Sequential()

# Input layer and the first hidden layer
model_2.add(Dense(14,activation="relu",input_dim = X_train_normalized.shape[1]))
model_2.add(Dropout(0.5))

# Second hidden layer
model_2.add(Dense(14, activation="relu"))
model_2.add(Dropout(0.5))

# third hidden layer
model_2.add(Dense(7, activation="relu"))
model_2.add(Dropout(0.5))

# output layer
model_2.add(Dense(1, activation="sigmoid"))
In [50]:
# Compile the model with Adam optimizer and a specified learning rate
#adam = Adam(learning_rate=0.001)
optimizer=keras.optimizers.Adam(learning_rate=0.001)

model_2.compile(optimizer=optimizer, loss='binary_crossentropy', metrics=['accuracy','f1_score'])
In [51]:
start = time.time()
history = model_2.fit(X_train_normalized, Y_train, validation_data=(X_val_normalized,Y_val) , batch_size=batch_size, epochs=epochs)
end=time.time()
Epoch 1/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 4s 6ms/step - accuracy: 0.5738 - f1_score: 0.3338 - loss: 0.7359 - val_accuracy: 0.8210 - val_f1_score: 0.3036 - val_loss: 0.5824
Epoch 2/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.7406 - f1_score: 0.3404 - loss: 0.6198 - val_accuracy: 0.8210 - val_f1_score: 0.3036 - val_loss: 0.5315
Epoch 3/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7931 - f1_score: 0.3264 - loss: 0.5670 - val_accuracy: 0.8210 - val_f1_score: 0.3036 - val_loss: 0.4891
Epoch 4/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.7908 - f1_score: 0.3414 - loss: 0.5468 - val_accuracy: 0.8210 - val_f1_score: 0.3036 - val_loss: 0.4656
Epoch 5/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.7982 - f1_score: 0.3332 - loss: 0.5309 - val_accuracy: 0.8210 - val_f1_score: 0.3036 - val_loss: 0.4455
Epoch 6/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.7951 - f1_score: 0.3392 - loss: 0.5166 - val_accuracy: 0.8210 - val_f1_score: 0.3036 - val_loss: 0.4368
Epoch 7/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.7873 - f1_score: 0.3512 - loss: 0.5118 - val_accuracy: 0.8210 - val_f1_score: 0.3036 - val_loss: 0.4248
Epoch 8/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.7918 - f1_score: 0.3514 - loss: 0.5027 - val_accuracy: 0.8210 - val_f1_score: 0.3036 - val_loss: 0.4199
Epoch 9/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7923 - f1_score: 0.3447 - loss: 0.5050 - val_accuracy: 0.8210 - val_f1_score: 0.3036 - val_loss: 0.4142
Epoch 10/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8007 - f1_score: 0.3343 - loss: 0.4833 - val_accuracy: 0.8210 - val_f1_score: 0.3036 - val_loss: 0.4138
Epoch 11/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8021 - f1_score: 0.3312 - loss: 0.4853 - val_accuracy: 0.8210 - val_f1_score: 0.3036 - val_loss: 0.4113
Epoch 12/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.7921 - f1_score: 0.3457 - loss: 0.4876 - val_accuracy: 0.8210 - val_f1_score: 0.3036 - val_loss: 0.4059
Epoch 13/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7927 - f1_score: 0.3482 - loss: 0.4813 - val_accuracy: 0.8210 - val_f1_score: 0.3036 - val_loss: 0.4054
Epoch 14/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7918 - f1_score: 0.3491 - loss: 0.4839 - val_accuracy: 0.8210 - val_f1_score: 0.3036 - val_loss: 0.4052
Epoch 15/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8000 - f1_score: 0.3375 - loss: 0.4725 - val_accuracy: 0.8210 - val_f1_score: 0.3036 - val_loss: 0.4053
Epoch 16/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7988 - f1_score: 0.3421 - loss: 0.4749 - val_accuracy: 0.8210 - val_f1_score: 0.3036 - val_loss: 0.3997
Epoch 17/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7978 - f1_score: 0.3440 - loss: 0.4726 - val_accuracy: 0.8210 - val_f1_score: 0.3036 - val_loss: 0.4020
Epoch 18/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.8041 - f1_score: 0.3305 - loss: 0.4646 - val_accuracy: 0.8210 - val_f1_score: 0.3036 - val_loss: 0.4053
Epoch 19/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.7963 - f1_score: 0.3394 - loss: 0.4708 - val_accuracy: 0.8210 - val_f1_score: 0.3036 - val_loss: 0.4014
Epoch 20/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 6ms/step - accuracy: 0.7969 - f1_score: 0.3387 - loss: 0.4649 - val_accuracy: 0.8210 - val_f1_score: 0.3036 - val_loss: 0.3986
Epoch 21/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 7ms/step - accuracy: 0.7956 - f1_score: 0.3343 - loss: 0.4661 - val_accuracy: 0.8210 - val_f1_score: 0.3036 - val_loss: 0.3997
Epoch 22/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.7955 - f1_score: 0.3420 - loss: 0.4725 - val_accuracy: 0.8210 - val_f1_score: 0.3036 - val_loss: 0.3978
Epoch 23/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7973 - f1_score: 0.3437 - loss: 0.4640 - val_accuracy: 0.8210 - val_f1_score: 0.3036 - val_loss: 0.3953
Epoch 24/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8094 - f1_score: 0.3279 - loss: 0.4477 - val_accuracy: 0.8210 - val_f1_score: 0.3036 - val_loss: 0.3958
Epoch 25/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7918 - f1_score: 0.3523 - loss: 0.4647 - val_accuracy: 0.8210 - val_f1_score: 0.3036 - val_loss: 0.3920
Epoch 26/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8079 - f1_score: 0.3390 - loss: 0.4502 - val_accuracy: 0.8210 - val_f1_score: 0.3036 - val_loss: 0.3900
Epoch 27/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8042 - f1_score: 0.3381 - loss: 0.4443 - val_accuracy: 0.8210 - val_f1_score: 0.3036 - val_loss: 0.3885
Epoch 28/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7965 - f1_score: 0.3472 - loss: 0.4654 - val_accuracy: 0.8210 - val_f1_score: 0.3036 - val_loss: 0.3882
Epoch 29/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8035 - f1_score: 0.3430 - loss: 0.4547 - val_accuracy: 0.8210 - val_f1_score: 0.3036 - val_loss: 0.3821
Epoch 30/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8069 - f1_score: 0.3393 - loss: 0.4518 - val_accuracy: 0.8210 - val_f1_score: 0.3036 - val_loss: 0.3820
Epoch 31/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8110 - f1_score: 0.3326 - loss: 0.4421 - val_accuracy: 0.8220 - val_f1_score: 0.3036 - val_loss: 0.3821
Epoch 32/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8065 - f1_score: 0.3321 - loss: 0.4404 - val_accuracy: 0.8220 - val_f1_score: 0.3036 - val_loss: 0.3784
Epoch 33/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.7923 - f1_score: 0.3485 - loss: 0.4552 - val_accuracy: 0.8240 - val_f1_score: 0.3036 - val_loss: 0.3737
Epoch 34/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8154 - f1_score: 0.3353 - loss: 0.4344 - val_accuracy: 0.8220 - val_f1_score: 0.3036 - val_loss: 0.3787
Epoch 35/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8114 - f1_score: 0.3344 - loss: 0.4387 - val_accuracy: 0.8220 - val_f1_score: 0.3036 - val_loss: 0.3749
Epoch 36/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8063 - f1_score: 0.3362 - loss: 0.4420 - val_accuracy: 0.8240 - val_f1_score: 0.3036 - val_loss: 0.3760
Epoch 37/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8054 - f1_score: 0.3418 - loss: 0.4447 - val_accuracy: 0.8220 - val_f1_score: 0.3036 - val_loss: 0.3729
Epoch 38/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8039 - f1_score: 0.3433 - loss: 0.4483 - val_accuracy: 0.8280 - val_f1_score: 0.3036 - val_loss: 0.3700
Epoch 39/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 6ms/step - accuracy: 0.8163 - f1_score: 0.3374 - loss: 0.4385 - val_accuracy: 0.8250 - val_f1_score: 0.3036 - val_loss: 0.3717
Epoch 40/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 6ms/step - accuracy: 0.8098 - f1_score: 0.3356 - loss: 0.4353 - val_accuracy: 0.8250 - val_f1_score: 0.3036 - val_loss: 0.3707
Epoch 41/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 6ms/step - accuracy: 0.8144 - f1_score: 0.3352 - loss: 0.4328 - val_accuracy: 0.8290 - val_f1_score: 0.3036 - val_loss: 0.3642
Epoch 42/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8152 - f1_score: 0.3334 - loss: 0.4346 - val_accuracy: 0.8270 - val_f1_score: 0.3036 - val_loss: 0.3648
Epoch 43/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8170 - f1_score: 0.3309 - loss: 0.4268 - val_accuracy: 0.8310 - val_f1_score: 0.3036 - val_loss: 0.3632
Epoch 44/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.7943 - f1_score: 0.3492 - loss: 0.4538 - val_accuracy: 0.8290 - val_f1_score: 0.3036 - val_loss: 0.3637
Epoch 45/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8099 - f1_score: 0.3439 - loss: 0.4421 - val_accuracy: 0.8430 - val_f1_score: 0.3036 - val_loss: 0.3598
Epoch 46/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8105 - f1_score: 0.3401 - loss: 0.4329 - val_accuracy: 0.8500 - val_f1_score: 0.3036 - val_loss: 0.3561
Epoch 47/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8139 - f1_score: 0.3413 - loss: 0.4277 - val_accuracy: 0.8420 - val_f1_score: 0.3036 - val_loss: 0.3566
Epoch 48/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8155 - f1_score: 0.3436 - loss: 0.4313 - val_accuracy: 0.8500 - val_f1_score: 0.3036 - val_loss: 0.3544
Epoch 49/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8092 - f1_score: 0.3419 - loss: 0.4424 - val_accuracy: 0.8390 - val_f1_score: 0.3036 - val_loss: 0.3586
Epoch 50/50
125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8188 - f1_score: 0.3291 - loss: 0.4244 - val_accuracy: 0.8510 - val_f1_score: 0.3036 - val_loss: 0.3530
In [52]:
print("Time taken in seconds ",end-start)
Time taken in seconds  35.90764355659485
In [53]:
plot(history,'loss')
In [54]:
plot(history,'accuracy')
In [55]:
model_2_train_perf = model_performance_classification(model_2, X_train_normalized, Y_train)
model_2_train_perf
250/250 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
Out[55]:
Accuracy Recall Precision F1 Score
0 0.824875 0.824875 0.84161 0.770177
In [56]:
model_2_val_perf = model_performance_classification(model_2, X_val_normalized, Y_val)
model_2_val_perf
32/32 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step
Out[56]:
Accuracy Recall Precision F1 Score
0 0.851 0.851 0.864536 0.806329

Comments on Model Performance

  • Validation accuracy is consistently over traning accuracy which could be an issue.
  • Validation loss is consistently below training loss which seems unusual
  • However, model seems to converge well and genaralise well but indicate some data leakage or other possible issue
  • Validation F1-score remained constant at .3036 across epochs showing model is failing to identify true positive and reducing false negatives
  • The model shows strong performance with high accuracy and recall, which is important for identifying customers likely to churn. However, there scope to improve precision to reduce false positives

Neural Network with Balanced Data (by applying SMOTE) and SGD Optimizer-M3

In [57]:
# clears the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
In [58]:
# To oversample and undersample data
from imblearn.over_sampling import SMOTE
In [59]:
print("Before Oversampling, counts of label 'Existing Customer': {}".format(sum(Y_train == 1)))
print("Before Oversampling, counts of label 'Churned Customer': {} \n".format(sum(Y_train == 0)))
print("Before Oversampling, counts of label 'Existing Customer-val': {}".format(sum(Y_val == 1)))
print("Before Oversampling, counts of label 'Churned Customer-val': {} \n".format(sum(Y_val == 0)))
print("Before Oversampling, counts of label 'Existing Customer-test': {}".format(sum(Y_test == 1)))
print("Before Oversampling, counts of label 'Churned Customer-test': {} \n".format(sum(Y_test == 0)))
Before Oversampling, counts of label 'Existing Customer': 1622
Before Oversampling, counts of label 'Churned Customer': 6378 

Before Oversampling, counts of label 'Existing Customer-val': 179
Before Oversampling, counts of label 'Churned Customer-val': 821 

Before Oversampling, counts of label 'Existing Customer-test': 236
Before Oversampling, counts of label 'Churned Customer-test': 764 

In [60]:
# Synthetic Minority Over Sampling Technique
sm = SMOTE(sampling_strategy=1, k_neighbors=5, random_state=1)
X_train_over, y_train_over = sm.fit_resample(X_train, Y_train)
In [61]:
print("After Oversampling, counts of label 'Attrited Customer': {}".format(sum(y_train_over == 1)))
print("After Oversampling, counts of label 'Existing Customer': {} \n".format(sum(y_train_over == 0)))
After Oversampling, counts of label 'Attrited Customer': 6378
After Oversampling, counts of label 'Existing Customer': 6378 

In [62]:
print("After Oversampling, the shape of train_X: {}".format(X_train_over.shape))
print("After Oversampling, the shape of train_y: {} \n".format(y_train_over.shape))
After Oversampling, the shape of train_X: (12756, 11)
After Oversampling, the shape of train_y: (12756,) 

In [63]:
# Normalize the features (do this AFTER SMOTE)
scaler = StandardScaler()
X_train_sm_normalized = scaler.fit_transform(X_train_over)
X_val_sm_normalized = scaler.transform(X_val)
X_test_sm_normalized = scaler.transform(X_test)
In [64]:
#Initializing the neural network
model_3= Sequential()

#Input layer and first hidden layer
model_3.add(Dense(14,activation="relu",input_dim=X_train_sm_normalized.shape[1]))

#Second hidden layer
model_3.add(Dense(14, activation="relu"))

#Third hidden layer
model_3.add(Dense(7,activation="relu"))

#Output layer
model_3.add(Dense(1,activation="sigmoid"))
In [65]:
optimizer = tf.keras.optimizers.SGD()    # defining SGD as the optimizer to be used
model_3.compile(loss='binary_crossentropy', optimizer=optimizer,metrics=['accuracy','f1_score'])
In [66]:
start = time.time()
history = model_3.fit(X_train_sm_normalized, y_train_over, validation_data=(X_val_sm_normalized,Y_val) , batch_size=batch_size, epochs=epochs)
end=time.time()
Epoch 1/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 2s 6ms/step - accuracy: 0.4514 - f1_score: 0.6692 - loss: 0.7396 - val_accuracy: 0.6290 - val_f1_score: 0.3036 - val_loss: 0.6560
Epoch 2/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.5437 - f1_score: 0.6694 - loss: 0.6848 - val_accuracy: 0.6720 - val_f1_score: 0.3036 - val_loss: 0.6443
Epoch 3/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 2s 10ms/step - accuracy: 0.6370 - f1_score: 0.6700 - loss: 0.6615 - val_accuracy: 0.7320 - val_f1_score: 0.3036 - val_loss: 0.6090
Epoch 4/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.6747 - f1_score: 0.6637 - loss: 0.6410 - val_accuracy: 0.7370 - val_f1_score: 0.3036 - val_loss: 0.5805
Epoch 5/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7046 - f1_score: 0.6709 - loss: 0.6152 - val_accuracy: 0.7490 - val_f1_score: 0.3036 - val_loss: 0.5442
Epoch 6/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7163 - f1_score: 0.6623 - loss: 0.5899 - val_accuracy: 0.7360 - val_f1_score: 0.3036 - val_loss: 0.5328
Epoch 7/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7280 - f1_score: 0.6685 - loss: 0.5708 - val_accuracy: 0.7400 - val_f1_score: 0.3036 - val_loss: 0.5216
Epoch 8/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7308 - f1_score: 0.6651 - loss: 0.5587 - val_accuracy: 0.7320 - val_f1_score: 0.3036 - val_loss: 0.5279
Epoch 9/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7383 - f1_score: 0.6621 - loss: 0.5428 - val_accuracy: 0.7370 - val_f1_score: 0.3036 - val_loss: 0.5212
Epoch 10/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7424 - f1_score: 0.6669 - loss: 0.5365 - val_accuracy: 0.7460 - val_f1_score: 0.3036 - val_loss: 0.5073
Epoch 11/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7506 - f1_score: 0.6673 - loss: 0.5288 - val_accuracy: 0.7510 - val_f1_score: 0.3036 - val_loss: 0.5007
Epoch 12/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7448 - f1_score: 0.6675 - loss: 0.5285 - val_accuracy: 0.7540 - val_f1_score: 0.3036 - val_loss: 0.5016
Epoch 13/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7545 - f1_score: 0.6647 - loss: 0.5203 - val_accuracy: 0.7590 - val_f1_score: 0.3036 - val_loss: 0.4946
Epoch 14/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7583 - f1_score: 0.6695 - loss: 0.5067 - val_accuracy: 0.7640 - val_f1_score: 0.3036 - val_loss: 0.4876
Epoch 15/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7595 - f1_score: 0.6682 - loss: 0.5070 - val_accuracy: 0.7640 - val_f1_score: 0.3036 - val_loss: 0.4880
Epoch 16/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7700 - f1_score: 0.6717 - loss: 0.4988 - val_accuracy: 0.7670 - val_f1_score: 0.3036 - val_loss: 0.4879
Epoch 17/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7675 - f1_score: 0.6659 - loss: 0.4929 - val_accuracy: 0.7750 - val_f1_score: 0.3036 - val_loss: 0.4769
Epoch 18/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7758 - f1_score: 0.6675 - loss: 0.4825 - val_accuracy: 0.7840 - val_f1_score: 0.3036 - val_loss: 0.4596
Epoch 19/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7812 - f1_score: 0.6663 - loss: 0.4725 - val_accuracy: 0.7800 - val_f1_score: 0.3036 - val_loss: 0.4657
Epoch 20/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7832 - f1_score: 0.6679 - loss: 0.4698 - val_accuracy: 0.7760 - val_f1_score: 0.3036 - val_loss: 0.4632
Epoch 21/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.7877 - f1_score: 0.6670 - loss: 0.4616 - val_accuracy: 0.7900 - val_f1_score: 0.3036 - val_loss: 0.4496
Epoch 22/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.7951 - f1_score: 0.6734 - loss: 0.4523 - val_accuracy: 0.7950 - val_f1_score: 0.3036 - val_loss: 0.4429
Epoch 23/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.8000 - f1_score: 0.6689 - loss: 0.4421 - val_accuracy: 0.8060 - val_f1_score: 0.3036 - val_loss: 0.4337
Epoch 24/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8068 - f1_score: 0.6670 - loss: 0.4346 - val_accuracy: 0.7950 - val_f1_score: 0.3036 - val_loss: 0.4398
Epoch 25/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8015 - f1_score: 0.6632 - loss: 0.4383 - val_accuracy: 0.7960 - val_f1_score: 0.3036 - val_loss: 0.4398
Epoch 26/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8040 - f1_score: 0.6705 - loss: 0.4304 - val_accuracy: 0.8070 - val_f1_score: 0.3036 - val_loss: 0.4155
Epoch 27/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8094 - f1_score: 0.6687 - loss: 0.4262 - val_accuracy: 0.8020 - val_f1_score: 0.3036 - val_loss: 0.4316
Epoch 28/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8072 - f1_score: 0.6586 - loss: 0.4157 - val_accuracy: 0.8000 - val_f1_score: 0.3036 - val_loss: 0.4368
Epoch 29/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8070 - f1_score: 0.6684 - loss: 0.4247 - val_accuracy: 0.8100 - val_f1_score: 0.3036 - val_loss: 0.4182
Epoch 30/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8172 - f1_score: 0.6699 - loss: 0.4105 - val_accuracy: 0.8170 - val_f1_score: 0.3036 - val_loss: 0.4126
Epoch 31/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8184 - f1_score: 0.6773 - loss: 0.4053 - val_accuracy: 0.8290 - val_f1_score: 0.3036 - val_loss: 0.3876
Epoch 32/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8123 - f1_score: 0.6661 - loss: 0.4076 - val_accuracy: 0.8120 - val_f1_score: 0.3036 - val_loss: 0.4117
Epoch 33/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8202 - f1_score: 0.6641 - loss: 0.4004 - val_accuracy: 0.7950 - val_f1_score: 0.3036 - val_loss: 0.4353
Epoch 34/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8184 - f1_score: 0.6623 - loss: 0.4008 - val_accuracy: 0.8240 - val_f1_score: 0.3036 - val_loss: 0.3948
Epoch 35/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8214 - f1_score: 0.6641 - loss: 0.3961 - val_accuracy: 0.8230 - val_f1_score: 0.3036 - val_loss: 0.3945
Epoch 36/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8250 - f1_score: 0.6674 - loss: 0.3889 - val_accuracy: 0.8210 - val_f1_score: 0.3036 - val_loss: 0.4056
Epoch 37/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8239 - f1_score: 0.6693 - loss: 0.3890 - val_accuracy: 0.8340 - val_f1_score: 0.3036 - val_loss: 0.3863
Epoch 38/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8172 - f1_score: 0.6670 - loss: 0.3905 - val_accuracy: 0.8210 - val_f1_score: 0.3036 - val_loss: 0.3955
Epoch 39/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8225 - f1_score: 0.6648 - loss: 0.3923 - val_accuracy: 0.8090 - val_f1_score: 0.3036 - val_loss: 0.4150
Epoch 40/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.8317 - f1_score: 0.6652 - loss: 0.3825 - val_accuracy: 0.8150 - val_f1_score: 0.3036 - val_loss: 0.4129
Epoch 41/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.8191 - f1_score: 0.6692 - loss: 0.3903 - val_accuracy: 0.8290 - val_f1_score: 0.3036 - val_loss: 0.3825
Epoch 42/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8252 - f1_score: 0.6634 - loss: 0.3901 - val_accuracy: 0.8210 - val_f1_score: 0.3036 - val_loss: 0.4095
Epoch 43/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8285 - f1_score: 0.6735 - loss: 0.3791 - val_accuracy: 0.8040 - val_f1_score: 0.3036 - val_loss: 0.4342
Epoch 44/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8316 - f1_score: 0.6682 - loss: 0.3740 - val_accuracy: 0.8380 - val_f1_score: 0.3036 - val_loss: 0.3821
Epoch 45/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8291 - f1_score: 0.6651 - loss: 0.3816 - val_accuracy: 0.8430 - val_f1_score: 0.3036 - val_loss: 0.3727
Epoch 46/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8324 - f1_score: 0.6701 - loss: 0.3722 - val_accuracy: 0.8300 - val_f1_score: 0.3036 - val_loss: 0.3964
Epoch 47/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8316 - f1_score: 0.6709 - loss: 0.3718 - val_accuracy: 0.8420 - val_f1_score: 0.3036 - val_loss: 0.3764
Epoch 48/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8316 - f1_score: 0.6693 - loss: 0.3719 - val_accuracy: 0.8290 - val_f1_score: 0.3036 - val_loss: 0.3889
Epoch 49/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8345 - f1_score: 0.6713 - loss: 0.3714 - val_accuracy: 0.8410 - val_f1_score: 0.3036 - val_loss: 0.3744
Epoch 50/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8383 - f1_score: 0.6671 - loss: 0.3661 - val_accuracy: 0.8400 - val_f1_score: 0.3036 - val_loss: 0.3788
In [67]:
print("Time taken in seconds ",end-start)
Time taken in seconds  39.39441275596619
In [68]:
plot(history,'loss')
In [69]:
plot(history,'accuracy')
In [70]:
model_3_train_perf = model_performance_classification(model_3, X_train_sm_normalized, y_train_over)
model_3_train_perf
399/399 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step
Out[70]:
Accuracy Recall Precision F1 Score
0 0.837802 0.837802 0.839266 0.837627
In [71]:
model_3_val_perf = model_performance_classification(model_3, X_val_sm_normalized, Y_val)
model_3_val_perf
32/32 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step 
Out[71]:
Accuracy Recall Precision F1 Score
0 0.84 0.84 0.850385 0.844415

Comments on Model Performance

  • Training accuracy consistently increased over epochs showing model is learning weel from data.Validation accouracy shows good genaralisation *Training and Validation loss decrease over time
  • However, model seems to converge well and genaralise well but indicate some data leakage or other possible issue
  • The training F1 score is consistently around 66.34%, whereas the validation F1 score remains constant at 38.36%. The low validation F1 score compared to the training F1 score indicates that the model might not be performing as well in identifying true positives on the validation set. *Model is genalising and performing well, both recall and precision are close on training and validation sets.

Neural Network with Balanced Data (by applying SMOTE) and Adam Optimizer-M4

In [72]:
# clears the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
In [73]:
#Initializing the neural network
model_4 = Sequential()

# Input layer and the first hidden layer
model_4.add(Dense(14,activation="relu",input_dim = X_train_sm_normalized.shape[1]))

# Second hidden layer
model_4.add(Dense(14, activation="relu"))

# third hidden layer
model_4.add(Dense(7, activation="relu"))

# output layer
model_4.add(Dense(1, activation="sigmoid"))
In [74]:
# Compile the model with Adam optimizer and a specified learning rate
#adam = Adam(learning_rate=0.001)
optimizer=keras.optimizers.Adam(learning_rate=0.001)

model_4.compile(optimizer=optimizer, loss='binary_crossentropy', metrics=['accuracy','f1_score'])
In [75]:
start = time.time()
history = model_4.fit(X_train_sm_normalized, y_train_over, validation_data=(X_val_sm_normalized,Y_val) , batch_size=batch_size, epochs=epochs)
end=time.time()
Epoch 1/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 3s 7ms/step - accuracy: 0.5746 - f1_score: 0.6642 - loss: 0.6702 - val_accuracy: 0.7430 - val_f1_score: 0.3036 - val_loss: 0.5531
Epoch 2/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 2s 5ms/step - accuracy: 0.7270 - f1_score: 0.6728 - loss: 0.5582 - val_accuracy: 0.7560 - val_f1_score: 0.3036 - val_loss: 0.5016
Epoch 3/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7515 - f1_score: 0.6636 - loss: 0.5140 - val_accuracy: 0.7400 - val_f1_score: 0.3036 - val_loss: 0.5085
Epoch 4/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7617 - f1_score: 0.6661 - loss: 0.4836 - val_accuracy: 0.7770 - val_f1_score: 0.3036 - val_loss: 0.4606
Epoch 5/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7754 - f1_score: 0.6700 - loss: 0.4682 - val_accuracy: 0.8000 - val_f1_score: 0.3036 - val_loss: 0.4281
Epoch 6/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7906 - f1_score: 0.6683 - loss: 0.4526 - val_accuracy: 0.7880 - val_f1_score: 0.3036 - val_loss: 0.4286
Epoch 7/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7876 - f1_score: 0.6701 - loss: 0.4464 - val_accuracy: 0.8080 - val_f1_score: 0.3036 - val_loss: 0.4148
Epoch 8/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8024 - f1_score: 0.6584 - loss: 0.4329 - val_accuracy: 0.8060 - val_f1_score: 0.3036 - val_loss: 0.4311
Epoch 9/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8113 - f1_score: 0.6752 - loss: 0.4178 - val_accuracy: 0.8390 - val_f1_score: 0.3036 - val_loss: 0.3735
Epoch 10/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8106 - f1_score: 0.6653 - loss: 0.4127 - val_accuracy: 0.8190 - val_f1_score: 0.3036 - val_loss: 0.4067
Epoch 11/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8192 - f1_score: 0.6622 - loss: 0.4043 - val_accuracy: 0.8310 - val_f1_score: 0.3036 - val_loss: 0.3881
Epoch 12/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8234 - f1_score: 0.6609 - loss: 0.3946 - val_accuracy: 0.8180 - val_f1_score: 0.3036 - val_loss: 0.4120
Epoch 13/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8280 - f1_score: 0.6651 - loss: 0.3896 - val_accuracy: 0.8280 - val_f1_score: 0.3036 - val_loss: 0.3946
Epoch 14/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8291 - f1_score: 0.6635 - loss: 0.3810 - val_accuracy: 0.8080 - val_f1_score: 0.3036 - val_loss: 0.4200
Epoch 15/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8341 - f1_score: 0.6716 - loss: 0.3733 - val_accuracy: 0.8320 - val_f1_score: 0.3036 - val_loss: 0.3847
Epoch 16/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.8300 - f1_score: 0.6730 - loss: 0.3782 - val_accuracy: 0.8500 - val_f1_score: 0.3036 - val_loss: 0.3652
Epoch 17/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 6ms/step - accuracy: 0.8395 - f1_score: 0.6723 - loss: 0.3630 - val_accuracy: 0.8470 - val_f1_score: 0.3036 - val_loss: 0.3603
Epoch 18/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.8468 - f1_score: 0.6673 - loss: 0.3605 - val_accuracy: 0.8390 - val_f1_score: 0.3036 - val_loss: 0.3803
Epoch 19/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8491 - f1_score: 0.6681 - loss: 0.3560 - val_accuracy: 0.8500 - val_f1_score: 0.3036 - val_loss: 0.3635
Epoch 20/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8480 - f1_score: 0.6615 - loss: 0.3510 - val_accuracy: 0.8360 - val_f1_score: 0.3036 - val_loss: 0.3837
Epoch 21/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8528 - f1_score: 0.6666 - loss: 0.3423 - val_accuracy: 0.8430 - val_f1_score: 0.3036 - val_loss: 0.3688
Epoch 22/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8538 - f1_score: 0.6683 - loss: 0.3382 - val_accuracy: 0.8440 - val_f1_score: 0.3036 - val_loss: 0.3634
Epoch 23/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8546 - f1_score: 0.6632 - loss: 0.3374 - val_accuracy: 0.8460 - val_f1_score: 0.3036 - val_loss: 0.3535
Epoch 24/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8604 - f1_score: 0.6682 - loss: 0.3306 - val_accuracy: 0.8480 - val_f1_score: 0.3036 - val_loss: 0.3648
Epoch 25/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8609 - f1_score: 0.6668 - loss: 0.3293 - val_accuracy: 0.8200 - val_f1_score: 0.3036 - val_loss: 0.4002
Epoch 26/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8648 - f1_score: 0.6715 - loss: 0.3188 - val_accuracy: 0.8540 - val_f1_score: 0.3036 - val_loss: 0.3451
Epoch 27/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8605 - f1_score: 0.6710 - loss: 0.3263 - val_accuracy: 0.8500 - val_f1_score: 0.3036 - val_loss: 0.3518
Epoch 28/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8645 - f1_score: 0.6753 - loss: 0.3191 - val_accuracy: 0.8640 - val_f1_score: 0.3036 - val_loss: 0.3381
Epoch 29/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8675 - f1_score: 0.6646 - loss: 0.3105 - val_accuracy: 0.8280 - val_f1_score: 0.3036 - val_loss: 0.3864
Epoch 30/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 2s 5ms/step - accuracy: 0.8691 - f1_score: 0.6728 - loss: 0.3081 - val_accuracy: 0.8410 - val_f1_score: 0.3036 - val_loss: 0.3600
Epoch 31/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.8656 - f1_score: 0.6662 - loss: 0.3092 - val_accuracy: 0.8420 - val_f1_score: 0.3036 - val_loss: 0.3651
Epoch 32/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.8671 - f1_score: 0.6689 - loss: 0.3096 - val_accuracy: 0.8470 - val_f1_score: 0.3036 - val_loss: 0.3566
Epoch 33/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8704 - f1_score: 0.6657 - loss: 0.3052 - val_accuracy: 0.8530 - val_f1_score: 0.3036 - val_loss: 0.3523
Epoch 34/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8715 - f1_score: 0.6680 - loss: 0.3003 - val_accuracy: 0.8500 - val_f1_score: 0.3036 - val_loss: 0.3558
Epoch 35/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8713 - f1_score: 0.6622 - loss: 0.2994 - val_accuracy: 0.8310 - val_f1_score: 0.3036 - val_loss: 0.3792
Epoch 36/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8734 - f1_score: 0.6690 - loss: 0.2951 - val_accuracy: 0.8520 - val_f1_score: 0.3036 - val_loss: 0.3489
Epoch 37/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8727 - f1_score: 0.6707 - loss: 0.2962 - val_accuracy: 0.8640 - val_f1_score: 0.3036 - val_loss: 0.3270
Epoch 38/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8733 - f1_score: 0.6620 - loss: 0.2979 - val_accuracy: 0.8200 - val_f1_score: 0.3036 - val_loss: 0.3908
Epoch 39/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8734 - f1_score: 0.6606 - loss: 0.2957 - val_accuracy: 0.8470 - val_f1_score: 0.3036 - val_loss: 0.3534
Epoch 40/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8766 - f1_score: 0.6656 - loss: 0.2876 - val_accuracy: 0.8630 - val_f1_score: 0.3036 - val_loss: 0.3294
Epoch 41/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8785 - f1_score: 0.6590 - loss: 0.2915 - val_accuracy: 0.8560 - val_f1_score: 0.3036 - val_loss: 0.3417
Epoch 42/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8747 - f1_score: 0.6689 - loss: 0.2886 - val_accuracy: 0.8380 - val_f1_score: 0.3036 - val_loss: 0.3616
Epoch 43/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8775 - f1_score: 0.6581 - loss: 0.2861 - val_accuracy: 0.8530 - val_f1_score: 0.3036 - val_loss: 0.3464
Epoch 44/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8787 - f1_score: 0.6708 - loss: 0.2846 - val_accuracy: 0.8630 - val_f1_score: 0.3036 - val_loss: 0.3296
Epoch 45/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8788 - f1_score: 0.6622 - loss: 0.2820 - val_accuracy: 0.8560 - val_f1_score: 0.3036 - val_loss: 0.3379
Epoch 46/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.8790 - f1_score: 0.6652 - loss: 0.2787 - val_accuracy: 0.8530 - val_f1_score: 0.3036 - val_loss: 0.3343
Epoch 47/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.8766 - f1_score: 0.6627 - loss: 0.2882 - val_accuracy: 0.8490 - val_f1_score: 0.3036 - val_loss: 0.3487
Epoch 48/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8814 - f1_score: 0.6733 - loss: 0.2748 - val_accuracy: 0.8400 - val_f1_score: 0.3036 - val_loss: 0.3608
Epoch 49/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8757 - f1_score: 0.6612 - loss: 0.2819 - val_accuracy: 0.8500 - val_f1_score: 0.3036 - val_loss: 0.3446
Epoch 50/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8791 - f1_score: 0.6680 - loss: 0.2799 - val_accuracy: 0.8710 - val_f1_score: 0.3036 - val_loss: 0.3260
In [76]:
print("Time taken in seconds ",end-start)
Time taken in seconds  50.11167621612549
In [77]:
plot(history,'loss')
In [78]:
plot(history,'accuracy')
In [79]:
model_4_train_perf = model_performance_classification(model_4, X_train_sm_normalized, y_train_over)
model_4_train_perf
399/399 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step
Out[79]:
Accuracy Recall Precision F1 Score
0 0.877705 0.877705 0.882444 0.877325
In [80]:
model_4_valid_perf = model_performance_classification(model_4, X_val_sm_normalized, Y_val)
model_4_valid_perf
32/32 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step 
Out[80]:
Accuracy Recall Precision F1 Score
0 0.871 0.871 0.865257 0.867463

Comments on Model Performance

  • Training Accuracy shows good performncae,validation accuracy shows good genalisations but some fluctuations which stablisizes well.
  • Both training and validation losses are decreasing, indicating that the model is learning and not overfitting.
  • Both training and validation accuracy are increasing, suggesting the model is improving its performance on both seen and unseen data.
  • The training F1 score is consistently around 66.34%, whereas the validation F1 score remains constant at 38.36%. The low validation F1 score compared to the training F1 score indicates that the model might not be performing as well in identifying true positives on the validation set. *The model is performing well, with high and consistent metrics across both training and validation sets.It shows high accuracy, balanced precision and recall and consistent f1 scores

Neural Network with Balanced Data (by applying SMOTE), Adam Optimizer, and Dropout-M5

In [81]:
# clears the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
In [82]:
#Initializing the neural network
model_5 = Sequential()

# Input layer and the first hidden layer
model_5.add(Dense(14,activation="relu",input_dim = X_train_sm_normalized.shape[1]))
model_5.add(Dropout(0.5))

# Second hidden layer
model_5.add(Dense(14, activation="relu"))
model_5.add(Dropout(0.5))

# third hidden layer
model_5.add(Dense(7, activation="relu"))
model_5.add(Dropout(0.5))

# output layer
model_5.add(Dense(1, activation="sigmoid"))
In [83]:
# Compile the model with Adam optimizer and a specified learning rate
#adam = Adam(learning_rate=0.001)
optimizer=keras.optimizers.Adam(learning_rate=0.001)

model_5.compile(optimizer=optimizer, loss='binary_crossentropy', metrics=['accuracy','f1_score'])
In [84]:
start = time.time()
history = model_5.fit(X_train_sm_normalized, y_train_over, validation_data=(X_val_sm_normalized,Y_val) , batch_size=batch_size, epochs=epochs)
end=time.time()
Epoch 1/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 4s 5ms/step - accuracy: 0.5071 - f1_score: 0.6673 - loss: 0.7240 - val_accuracy: 0.8170 - val_f1_score: 0.3036 - val_loss: 0.6682
Epoch 2/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.5390 - f1_score: 0.6718 - loss: 0.6906 - val_accuracy: 0.7930 - val_f1_score: 0.3036 - val_loss: 0.6638
Epoch 3/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.5547 - f1_score: 0.6658 - loss: 0.6813 - val_accuracy: 0.7870 - val_f1_score: 0.3036 - val_loss: 0.6481
Epoch 4/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.5883 - f1_score: 0.6650 - loss: 0.6735 - val_accuracy: 0.7800 - val_f1_score: 0.3036 - val_loss: 0.6263
Epoch 5/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.5899 - f1_score: 0.6707 - loss: 0.6651 - val_accuracy: 0.7760 - val_f1_score: 0.3036 - val_loss: 0.6094
Epoch 6/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.6235 - f1_score: 0.6683 - loss: 0.6531 - val_accuracy: 0.7920 - val_f1_score: 0.3036 - val_loss: 0.6025
Epoch 7/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.6336 - f1_score: 0.6791 - loss: 0.6431 - val_accuracy: 0.7920 - val_f1_score: 0.3036 - val_loss: 0.5787
Epoch 8/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.6374 - f1_score: 0.6725 - loss: 0.6473 - val_accuracy: 0.7710 - val_f1_score: 0.3036 - val_loss: 0.5800
Epoch 9/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.6517 - f1_score: 0.6657 - loss: 0.6334 - val_accuracy: 0.7700 - val_f1_score: 0.3036 - val_loss: 0.5633
Epoch 10/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.6644 - f1_score: 0.6656 - loss: 0.6269 - val_accuracy: 0.7650 - val_f1_score: 0.3036 - val_loss: 0.5644
Epoch 11/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 2s 5ms/step - accuracy: 0.6714 - f1_score: 0.6611 - loss: 0.6302 - val_accuracy: 0.7630 - val_f1_score: 0.3036 - val_loss: 0.5648
Epoch 12/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 6ms/step - accuracy: 0.6737 - f1_score: 0.6704 - loss: 0.6177 - val_accuracy: 0.7670 - val_f1_score: 0.3036 - val_loss: 0.5466
Epoch 13/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 6ms/step - accuracy: 0.6774 - f1_score: 0.6675 - loss: 0.6180 - val_accuracy: 0.7740 - val_f1_score: 0.3036 - val_loss: 0.5407
Epoch 14/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.6966 - f1_score: 0.6632 - loss: 0.6044 - val_accuracy: 0.7680 - val_f1_score: 0.3036 - val_loss: 0.5348
Epoch 15/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.6772 - f1_score: 0.6709 - loss: 0.6130 - val_accuracy: 0.7670 - val_f1_score: 0.3036 - val_loss: 0.5341
Epoch 16/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.6873 - f1_score: 0.6673 - loss: 0.6075 - val_accuracy: 0.7790 - val_f1_score: 0.3036 - val_loss: 0.5295
Epoch 17/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.6861 - f1_score: 0.6656 - loss: 0.6105 - val_accuracy: 0.7770 - val_f1_score: 0.3036 - val_loss: 0.5242
Epoch 18/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.6887 - f1_score: 0.6664 - loss: 0.6101 - val_accuracy: 0.7880 - val_f1_score: 0.3036 - val_loss: 0.5163
Epoch 19/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.7044 - f1_score: 0.6649 - loss: 0.5927 - val_accuracy: 0.7760 - val_f1_score: 0.3036 - val_loss: 0.5213
Epoch 20/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.6943 - f1_score: 0.6645 - loss: 0.5974 - val_accuracy: 0.7860 - val_f1_score: 0.3036 - val_loss: 0.5160
Epoch 21/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.6988 - f1_score: 0.6708 - loss: 0.5954 - val_accuracy: 0.7880 - val_f1_score: 0.3036 - val_loss: 0.5030
Epoch 22/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7066 - f1_score: 0.6687 - loss: 0.5944 - val_accuracy: 0.7980 - val_f1_score: 0.3036 - val_loss: 0.4932
Epoch 23/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 2s 6ms/step - accuracy: 0.7070 - f1_score: 0.6668 - loss: 0.5866 - val_accuracy: 0.7810 - val_f1_score: 0.3036 - val_loss: 0.4971
Epoch 24/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.7176 - f1_score: 0.6634 - loss: 0.5860 - val_accuracy: 0.7870 - val_f1_score: 0.3036 - val_loss: 0.5011
Epoch 25/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 6ms/step - accuracy: 0.7133 - f1_score: 0.6659 - loss: 0.5817 - val_accuracy: 0.8000 - val_f1_score: 0.3036 - val_loss: 0.4933
Epoch 26/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.7089 - f1_score: 0.6661 - loss: 0.5877 - val_accuracy: 0.8040 - val_f1_score: 0.3036 - val_loss: 0.4901
Epoch 27/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7185 - f1_score: 0.6635 - loss: 0.5768 - val_accuracy: 0.7970 - val_f1_score: 0.3036 - val_loss: 0.4984
Epoch 28/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7189 - f1_score: 0.6607 - loss: 0.5776 - val_accuracy: 0.8130 - val_f1_score: 0.3036 - val_loss: 0.4809
Epoch 29/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7153 - f1_score: 0.6678 - loss: 0.5763 - val_accuracy: 0.8130 - val_f1_score: 0.3036 - val_loss: 0.4793
Epoch 30/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7227 - f1_score: 0.6646 - loss: 0.5760 - val_accuracy: 0.8160 - val_f1_score: 0.3036 - val_loss: 0.4701
Epoch 31/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7239 - f1_score: 0.6649 - loss: 0.5661 - val_accuracy: 0.8160 - val_f1_score: 0.3036 - val_loss: 0.4684
Epoch 32/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7236 - f1_score: 0.6756 - loss: 0.5638 - val_accuracy: 0.8190 - val_f1_score: 0.3036 - val_loss: 0.4717
Epoch 33/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.7269 - f1_score: 0.6667 - loss: 0.5645 - val_accuracy: 0.8150 - val_f1_score: 0.3036 - val_loss: 0.4687
Epoch 34/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.7285 - f1_score: 0.6668 - loss: 0.5702 - val_accuracy: 0.8180 - val_f1_score: 0.3036 - val_loss: 0.4634
Epoch 35/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7222 - f1_score: 0.6725 - loss: 0.5677 - val_accuracy: 0.8120 - val_f1_score: 0.3036 - val_loss: 0.4708
Epoch 36/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 2s 5ms/step - accuracy: 0.7233 - f1_score: 0.6703 - loss: 0.5710 - val_accuracy: 0.8110 - val_f1_score: 0.3036 - val_loss: 0.4666
Epoch 37/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.7328 - f1_score: 0.6682 - loss: 0.5659 - val_accuracy: 0.8110 - val_f1_score: 0.3036 - val_loss: 0.4595
Epoch 38/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 6ms/step - accuracy: 0.7271 - f1_score: 0.6688 - loss: 0.5668 - val_accuracy: 0.8210 - val_f1_score: 0.3036 - val_loss: 0.4563
Epoch 39/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 6ms/step - accuracy: 0.7232 - f1_score: 0.6630 - loss: 0.5748 - val_accuracy: 0.8100 - val_f1_score: 0.3036 - val_loss: 0.4697
Epoch 40/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7241 - f1_score: 0.6654 - loss: 0.5664 - val_accuracy: 0.8090 - val_f1_score: 0.3036 - val_loss: 0.4681
Epoch 41/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7238 - f1_score: 0.6684 - loss: 0.5644 - val_accuracy: 0.8210 - val_f1_score: 0.3036 - val_loss: 0.4645
Epoch 42/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7180 - f1_score: 0.6643 - loss: 0.5626 - val_accuracy: 0.8230 - val_f1_score: 0.3036 - val_loss: 0.4515
Epoch 43/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7297 - f1_score: 0.6615 - loss: 0.5597 - val_accuracy: 0.8300 - val_f1_score: 0.3036 - val_loss: 0.4558
Epoch 44/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7348 - f1_score: 0.6636 - loss: 0.5564 - val_accuracy: 0.8250 - val_f1_score: 0.3036 - val_loss: 0.4577
Epoch 45/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7209 - f1_score: 0.6709 - loss: 0.5671 - val_accuracy: 0.8280 - val_f1_score: 0.3036 - val_loss: 0.4623
Epoch 46/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7271 - f1_score: 0.6665 - loss: 0.5658 - val_accuracy: 0.8280 - val_f1_score: 0.3036 - val_loss: 0.4598
Epoch 47/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7261 - f1_score: 0.6669 - loss: 0.5661 - val_accuracy: 0.8210 - val_f1_score: 0.3036 - val_loss: 0.4585
Epoch 48/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.7233 - f1_score: 0.6597 - loss: 0.5659 - val_accuracy: 0.8140 - val_f1_score: 0.3036 - val_loss: 0.4672
Epoch 49/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - accuracy: 0.7305 - f1_score: 0.6689 - loss: 0.5610 - val_accuracy: 0.8220 - val_f1_score: 0.3036 - val_loss: 0.4601
Epoch 50/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 3s 12ms/step - accuracy: 0.7278 - f1_score: 0.6680 - loss: 0.5575 - val_accuracy: 0.8240 - val_f1_score: 0.3036 - val_loss: 0.4613
In [85]:
print("Time taken in seconds ",end-start)
Time taken in seconds  61.24990773200989
In [86]:
plot(history,'loss')
In [87]:
plot(history,'accuracy')
In [88]:
model_5_train_perf = model_performance_classification(model_5, X_train_sm_normalized, y_train_over)
model_5_train_perf
399/399 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step
Out[88]:
Accuracy Recall Precision F1 Score
0 0.793117 0.793117 0.795435 0.79271
In [89]:
model_5_valid_perf = model_performance_classification(model_5, X_val_sm_normalized, Y_val)
model_5_valid_perf
32/32 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step
Out[89]:
Accuracy Recall Precision F1 Score
0 0.824 0.824 0.860278 0.835979

Comments on Model Performance

*We see decrease in both training and bvalidation loss over epochs and increase in training and validation accuracy over times

  • Model is not overfitting , there are minor fluctuations
  • Model shows good genarisations,consistent f1 scores (suggesting stable balance between recall and precision)

Model Performance Comparison and Final Model Selection

In [90]:
# training performance comparison

models_train_comp_df = pd.concat(
    [
        model_0_train_perf.T,
        model_1_train_perf.T,
        model_2_train_perf.T,
        model_3_train_perf.T,
        model_4_train_perf.T,
        model_5_train_perf.T,
      ],
    axis=1,
)
models_train_comp_df.columns = [
    "NN with SGD Optimizer-M0",
    "NN with Adam Optimizer-M1",
    "NNk with Adam Optimizer and Dropout-M2",
    "NN(SMOTE) and SGD Optimizer-M3",
    "NN( SMOTE) and Adam Optimizer-M4",
    "NN(SMOTE), Adam Optimizer, and Dropout-M5"
]
In [91]:
#Validation performance comparison

models_valid_comp_df = pd.concat(
    [
        model_0_val_perf.T,
        model_1_val_perf.T,
        model_2_val_perf.T,
        model_3_val_perf.T,
        model_4_valid_perf.T,
        model_5_valid_perf.T,

    ],
    axis=1,
)
models_valid_comp_df.columns = [
     "NN with SGD Optimizer-M0",
    "NN with Adam Optimizer-M1",
    "NNk with Adam Optimizer and Dropout-M2",
    "NN(SMOTE) and SGD Optimizer-M3",
    "NN( SMOTE) and Adam Optimizer-M4",
    "NN(SMOTE), Adam Optimizer, and Dropout-M5"
]
In [92]:
models_train_comp_df
Out[92]:
NN with SGD Optimizer-M0 NN with Adam Optimizer-M1 NNk with Adam Optimizer and Dropout-M2 NN(SMOTE) and SGD Optimizer-M3 NN( SMOTE) and Adam Optimizer-M4 NN(SMOTE), Adam Optimizer, and Dropout-M5
Accuracy 0.854500 0.865000 0.824875 0.837802 0.877705 0.793117
Recall 0.854500 0.865000 0.824875 0.837802 0.877705 0.793117
Precision 0.844325 0.856189 0.841610 0.839266 0.882444 0.795435
F1 Score 0.837570 0.853499 0.770177 0.837627 0.877325 0.792710
In [93]:
models_valid_comp_df
Out[93]:
NN with SGD Optimizer-M0 NN with Adam Optimizer-M1 NNk with Adam Optimizer and Dropout-M2 NN(SMOTE) and SGD Optimizer-M3 NN( SMOTE) and Adam Optimizer-M4 NN(SMOTE), Adam Optimizer, and Dropout-M5
Accuracy 0.880000 0.886000 0.851000 0.840000 0.871000 0.824000
Recall 0.880000 0.886000 0.851000 0.840000 0.871000 0.824000
Precision 0.872717 0.878312 0.864536 0.850385 0.865257 0.860278
F1 Score 0.865521 0.876533 0.806329 0.844415 0.867463 0.835979

Final Model Selection

  • On comparing all the model performances listed above , Adam Optimizer-M4 NN(SMOTE) is the best model and will be the Final model for selction .

Adam Optimizer-M4 NN(SMOTE)

Training Metrics: Accuracy: 87.70%, Recall: 87.7%, Precision: 88.24%, F1 Score: 87.73%

Validation Metrics: Accuracy: 87.1.00%, Recall: 87.70%, Precision: 86.52%, F1 Score: 86.74%

Reasons:

  • Highest accuracy and F1 Score(indicating good overall performance and balance between precision and recall.)
  • Validation set generasing best showing no signs of overfitting

Final Model

In [95]:
# clears the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
In [96]:
#Initializing the neural network
model_test = Sequential()

# Input layer and the first hidden layer
model_test.add(Dense(14,activation="relu",input_dim = X_test_normalized.shape[1]))

# Second hidden layer
model_test.add(Dense(14, activation="relu"))

# third hidden layer
model_test.add(Dense(7, activation="relu"))

# output layer
model_test.add(Dense(1, activation="sigmoid"))
In [97]:
# Compile the model with Adam optimizer and a specified learning rate
#adam = Adam(learning_rate=0.001)
optimizer=keras.optimizers.Adam(learning_rate=0.001)

model_test.compile(optimizer=optimizer, loss='binary_crossentropy', metrics=['accuracy','f1_score'])
In [98]:
start = time.time()
history = model_test.fit(X_train_sm_normalized, y_train_over, validation_data=(X_test_sm_normalized,Y_test) , batch_size=batch_size, epochs=epochs)
end=time.time()
Epoch 1/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 3s 5ms/step - accuracy: 0.5306 - f1_score: 0.6680 - loss: 0.6977 - val_accuracy: 0.7160 - val_f1_score: 0.3819 - val_loss: 0.5838
Epoch 2/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.6997 - f1_score: 0.6647 - loss: 0.5996 - val_accuracy: 0.7180 - val_f1_score: 0.3819 - val_loss: 0.5347
Epoch 3/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7302 - f1_score: 0.6713 - loss: 0.5420 - val_accuracy: 0.7330 - val_f1_score: 0.3819 - val_loss: 0.5023
Epoch 4/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7550 - f1_score: 0.6587 - loss: 0.5013 - val_accuracy: 0.7650 - val_f1_score: 0.3819 - val_loss: 0.4770
Epoch 5/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7804 - f1_score: 0.6661 - loss: 0.4719 - val_accuracy: 0.7810 - val_f1_score: 0.3819 - val_loss: 0.4601
Epoch 6/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 2s 5ms/step - accuracy: 0.8001 - f1_score: 0.6675 - loss: 0.4430 - val_accuracy: 0.7870 - val_f1_score: 0.3819 - val_loss: 0.4416
Epoch 7/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.7951 - f1_score: 0.6686 - loss: 0.4397 - val_accuracy: 0.7850 - val_f1_score: 0.3819 - val_loss: 0.4461
Epoch 8/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.8006 - f1_score: 0.6614 - loss: 0.4341 - val_accuracy: 0.7910 - val_f1_score: 0.3819 - val_loss: 0.4476
Epoch 9/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8106 - f1_score: 0.6647 - loss: 0.4169 - val_accuracy: 0.7920 - val_f1_score: 0.3819 - val_loss: 0.4437
Epoch 10/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8134 - f1_score: 0.6663 - loss: 0.4107 - val_accuracy: 0.7880 - val_f1_score: 0.3819 - val_loss: 0.4496
Epoch 11/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8137 - f1_score: 0.6703 - loss: 0.4082 - val_accuracy: 0.7920 - val_f1_score: 0.3819 - val_loss: 0.4401
Epoch 12/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8231 - f1_score: 0.6688 - loss: 0.3955 - val_accuracy: 0.7930 - val_f1_score: 0.3819 - val_loss: 0.4369
Epoch 13/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8248 - f1_score: 0.6659 - loss: 0.3876 - val_accuracy: 0.7950 - val_f1_score: 0.3819 - val_loss: 0.4279
Epoch 14/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8280 - f1_score: 0.6669 - loss: 0.3866 - val_accuracy: 0.8100 - val_f1_score: 0.3819 - val_loss: 0.4219
Epoch 15/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8282 - f1_score: 0.6679 - loss: 0.3808 - val_accuracy: 0.8050 - val_f1_score: 0.3819 - val_loss: 0.4310
Epoch 16/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8375 - f1_score: 0.6673 - loss: 0.3689 - val_accuracy: 0.8050 - val_f1_score: 0.3819 - val_loss: 0.4196
Epoch 17/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8399 - f1_score: 0.6687 - loss: 0.3730 - val_accuracy: 0.8030 - val_f1_score: 0.3819 - val_loss: 0.4281
Epoch 18/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8353 - f1_score: 0.6685 - loss: 0.3699 - val_accuracy: 0.8110 - val_f1_score: 0.3819 - val_loss: 0.4206
Epoch 19/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8432 - f1_score: 0.6670 - loss: 0.3573 - val_accuracy: 0.8240 - val_f1_score: 0.3819 - val_loss: 0.4067
Epoch 20/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.8459 - f1_score: 0.6661 - loss: 0.3498 - val_accuracy: 0.8170 - val_f1_score: 0.3819 - val_loss: 0.4114
Epoch 21/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.8447 - f1_score: 0.6682 - loss: 0.3562 - val_accuracy: 0.8280 - val_f1_score: 0.3819 - val_loss: 0.3943
Epoch 22/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.8526 - f1_score: 0.6637 - loss: 0.3376 - val_accuracy: 0.8140 - val_f1_score: 0.3819 - val_loss: 0.4135
Epoch 23/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 2s 6ms/step - accuracy: 0.8545 - f1_score: 0.6631 - loss: 0.3389 - val_accuracy: 0.8220 - val_f1_score: 0.3819 - val_loss: 0.3994
Epoch 24/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - accuracy: 0.8500 - f1_score: 0.6669 - loss: 0.3370 - val_accuracy: 0.8090 - val_f1_score: 0.3819 - val_loss: 0.4047
Epoch 25/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8579 - f1_score: 0.6697 - loss: 0.3337 - val_accuracy: 0.8240 - val_f1_score: 0.3819 - val_loss: 0.3910
Epoch 26/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8531 - f1_score: 0.6676 - loss: 0.3340 - val_accuracy: 0.8080 - val_f1_score: 0.3819 - val_loss: 0.4236
Epoch 27/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8576 - f1_score: 0.6686 - loss: 0.3306 - val_accuracy: 0.8260 - val_f1_score: 0.3819 - val_loss: 0.3917
Epoch 28/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8586 - f1_score: 0.6670 - loss: 0.3248 - val_accuracy: 0.8280 - val_f1_score: 0.3819 - val_loss: 0.3880
Epoch 29/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8642 - f1_score: 0.6691 - loss: 0.3199 - val_accuracy: 0.8210 - val_f1_score: 0.3819 - val_loss: 0.3966
Epoch 30/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8641 - f1_score: 0.6672 - loss: 0.3149 - val_accuracy: 0.8200 - val_f1_score: 0.3819 - val_loss: 0.4035
Epoch 31/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8643 - f1_score: 0.6565 - loss: 0.3161 - val_accuracy: 0.8230 - val_f1_score: 0.3819 - val_loss: 0.4017
Epoch 32/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8625 - f1_score: 0.6665 - loss: 0.3212 - val_accuracy: 0.8220 - val_f1_score: 0.3819 - val_loss: 0.4056
Epoch 33/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8646 - f1_score: 0.6669 - loss: 0.3167 - val_accuracy: 0.8300 - val_f1_score: 0.3819 - val_loss: 0.3878
Epoch 34/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8647 - f1_score: 0.6621 - loss: 0.3134 - val_accuracy: 0.8150 - val_f1_score: 0.3819 - val_loss: 0.4016
Epoch 35/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.8637 - f1_score: 0.6662 - loss: 0.3151 - val_accuracy: 0.8300 - val_f1_score: 0.3819 - val_loss: 0.3864
Epoch 36/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.8658 - f1_score: 0.6626 - loss: 0.3073 - val_accuracy: 0.8320 - val_f1_score: 0.3819 - val_loss: 0.3868
Epoch 37/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8674 - f1_score: 0.6636 - loss: 0.3079 - val_accuracy: 0.8300 - val_f1_score: 0.3819 - val_loss: 0.3835
Epoch 38/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8700 - f1_score: 0.6650 - loss: 0.2974 - val_accuracy: 0.8070 - val_f1_score: 0.3819 - val_loss: 0.4268
Epoch 39/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8737 - f1_score: 0.6684 - loss: 0.2987 - val_accuracy: 0.8340 - val_f1_score: 0.3819 - val_loss: 0.3795
Epoch 40/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8736 - f1_score: 0.6660 - loss: 0.2903 - val_accuracy: 0.8310 - val_f1_score: 0.3819 - val_loss: 0.3860
Epoch 41/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8732 - f1_score: 0.6704 - loss: 0.2986 - val_accuracy: 0.8290 - val_f1_score: 0.3819 - val_loss: 0.3851
Epoch 42/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8751 - f1_score: 0.6675 - loss: 0.2964 - val_accuracy: 0.8240 - val_f1_score: 0.3819 - val_loss: 0.3881
Epoch 43/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8713 - f1_score: 0.6675 - loss: 0.2982 - val_accuracy: 0.8250 - val_f1_score: 0.3819 - val_loss: 0.3831
Epoch 44/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8737 - f1_score: 0.6590 - loss: 0.2915 - val_accuracy: 0.8270 - val_f1_score: 0.3819 - val_loss: 0.3974
Epoch 45/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8757 - f1_score: 0.6598 - loss: 0.2848 - val_accuracy: 0.8310 - val_f1_score: 0.3819 - val_loss: 0.3818
Epoch 46/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8722 - f1_score: 0.6668 - loss: 0.2943 - val_accuracy: 0.8290 - val_f1_score: 0.3819 - val_loss: 0.3857
Epoch 47/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8733 - f1_score: 0.6657 - loss: 0.2928 - val_accuracy: 0.8250 - val_f1_score: 0.3819 - val_loss: 0.3823
Epoch 48/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8729 - f1_score: 0.6629 - loss: 0.2922 - val_accuracy: 0.8350 - val_f1_score: 0.3819 - val_loss: 0.3823
Epoch 49/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 2s 5ms/step - accuracy: 0.8759 - f1_score: 0.6665 - loss: 0.2831 - val_accuracy: 0.8340 - val_f1_score: 0.3819 - val_loss: 0.3785
Epoch 50/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 6ms/step - accuracy: 0.8761 - f1_score: 0.6668 - loss: 0.2864 - val_accuracy: 0.8150 - val_f1_score: 0.3819 - val_loss: 0.3954
In [99]:
model_test_perf = model_performance_classification(model_test, X_test_sm_normalized, Y_test)
model_test_perf
32/32 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step
Out[99]:
Accuracy Recall Precision F1 Score
0 0.815 0.815 0.81473 0.814864

Actionable Insights and Business Recommendations

Actionable Insighst

  • The models shows high accurancy, recall, presicion and f1_score good generalisation showing that it can reliably predict customers whih will leave the bank.

  • Banks should leverage important features as Balance,CreditScore, Active Membership, Geography to understand reasons for customers leaving the bank.

  • Further improvements in the model can be achieved using more hypertuning paramaters tweaking like Requlaisation,Early Stopping , using class weights

Business Recommendations

  • Using the insights from the model, bank should target at-risk customers with tailored engagement strategies , ex: loyalty rewards, exclusive discounts , financial advise

  • AT-risk customers should be provided dedicated customer support with bank offerings and resolving their issues

  • Since bank provides services, using predictive insights the business team should analyse at-risk customers feeback and usage pattern and identify areas of improvements :ex: improving mobile banking features, making loan applicationn process easy

  • Bank can offer Targeted marketing Campaingns based on churn insights focusing on features used by at-risk customers

  • Tailored Strategties for specific groups like High Value customers or long termn customers or new customers who are at churn risk will also help bank retain customers

Power Ahead